A Lobotomy Prognosis Scale¹


Wesley C. Becker and Robert L. McFarland 
VA Hospital, Palo Alto, California 


Since the development of the prefrontal 
lobotomy technique by Moniz in 1936 and 
its introduction into this country by Free- 
man and Watts, much effort has been devoted 
to evaluating this therapeutic method. From 
a psychiatric viewpoint, evaluation has centered
on selection of suitable candidates for
lobotomy and on assessment of the changes 
produced by the operation. In this paper we 
are concerned primarily with the problem of 
selection. 


A review of the literature readily reveals 
many opinions and some experimental evi- 


1 From VA Hospital, Palo Alto, California. 

2 Becker selected the items of this scale, tried them 
out on published case data, developed the original 
prognosis scale and its manual, collected and ana- 
lyzed the Canandaigua data, and served as a rater 
on the Palo Alto study. McFarland was in charge 
of the collection and analysis of the Palo Alto data, 
served as a rater at Palo Alto, and prepared the re- 
vised scale titled the Palo Alto Lobotomy Prognosis 
Scale and its manual. Dr. K. P. Jones and Miss 
Mary A. Hansen served as raters in the Palo Alto 
study and cooperated in the preparation of a pre- 
liminary report of that study. The authors wish to 
thank B. F. McNeal for his suggestions during the 
early phases of this study, S. D. Schultz for his as- 
sistance in making the improvement ratings, Robert 
E. Billings for his help in the statistical computa- 
tions, Mary Ann Swanson for her aid throughout 
the study, and, finally, the managers of the VA hos- 
pitals at Canandaigua, New York, and Palo Alto, 
California, who made this study possible. 

3 Authority is granted for professional workers to 
mimeograph, for their own use, the Palo Alto Lobo- 
tomy Prognosis Scale data sheets. Copies of the test 
forms, template, conversion table, and manual have 
been deposited with the American Documentation 
Institute. Order Document No. 4468 from ADI Aux- 
iliary Publications Project, Photoduplication Serv- 
ice, Library of Congress, Washington 25, D. C., re- 
mitting in advance $2.00 for microfilm or $3.75 for 
photocopies. Make checks payable to Chief, Photo- 
duplication Service, Library of Congress. 


dence on factors related to improvement 
after lobotomy. For example, such factors as 
the tortured self-concern syndrome, rapid 
onset, lack of deterioration, and good pre- 
morbid adjustment have been repeatedly re- 
ported as favorable prognostic indicators. 
The primary methods used in gathering such 
data are the psychiatric interview and his- 
tory, illustrated by the studies of Partridge 
(5) and Greenblatt et al. (2); and psycho- 
logical testing, such as by Scherer’s (6) study 
of memory, abstraction, and motor function- 
ing. Though the literature abounds with signs 
and some significant differences to guide in 
the selection of lobotomy candidates, there 
has been no systematic attempt to quantify 
the data in the form of a prognosis scale and 
to test the predictive validity of such a scale. 
This lack prompted the present study. 

This study attempted to construct a prog- 
nosis scale to permit more precise identifica- 
tion of patients who would improve as a re- 
sult of prefrontal lobotomy. To do so (a) 
the pertinent literature in the field was sur- 
veyed to find promising items of predictive 
significance; (b) these items were tried on 
certain published case histories; (c) the items 
which survived this procedure were combined 
into a preliminary scale and then used to 
evaluate, postdictively, 55 lobotomized pa- 
tients at the VA Hospital, Canandaigua, New 
York; (d) the scale was then cross validated, 
the stability of the cutoff score was tested, 
and rater reliability was established for the 
scale on 60 lobotomized patients at the VA 
Hospital, Palo Alto, California; and finally 
(e) the scale was revised and improved by 
essentially repeating steps c and d. Only the 
final form of the scale, titled the Palo Alto 
Lobotomy Prognosis Scale, is described here. 
Methods and Subjects 
Preliminary Prognosis Scale 


A preliminary scale of 32 items was con- 
structed from factors found to have some 
predictive value by Greenblatt et al. (2) in 
their study of 205 lobotomy patients at the 
Boston Psychopathic Hospital. This source 
for the initial pool of items⁴ was selected because
of the comprehensiveness of the variables
rated, the relative ease with which the
data for rating could be obtained, and the 
general agreement of the Boston study with 
other studies. The items included aspects of 
premorbid personality, history, preoperative 
psychological functioning, behavior, and psy- 
chiatric diagnosis. Three mental status items 
were converted from ratings to psychological 
test scores. Other modifications of item defi- 
nitions were made when necessary to increase 
clarity. 

A trial run of the 32-item prognosis scale 
was made using data contained in 24 pub- 
lished case histories from the Columbia-Grey- 
stone study of cortical ablation (4). Some 
items were dropped because of their overde- 
pendence on the nature of the sample (e.g., 
religion, age, occupation). Others were modi- 
fied or eliminated because of difficulty in rat- 
ing or lack of a significant trend. The results 
were generally encouraging and 23 items 
were retained for further study. 


Scale-Construction Study 


To test the discriminating power of the 23- 
item prognosis scale and to determine a use- 
ful cutting score, it was applied in July, 1953 
to the records of 55 male neuropsychiatric 
patients lobotomized consecutively at the VA 
Hospital, Canandaigua, New York, between 
1948 and 1951. The scale items were rated 
from data in the clinical folders such as doc- 
tor’s progress notes, board reports, social 
service reports, psychological reports, and 
nurses’ notes. Where data were lacking, the 
total score was prorated on the basis of rat- 
able items. 

4 The word “items” is used for convenience only.
Actually we are dealing with dichotomized continuous
variables in many instances.

Criterion improvement ratings. Since the
purpose of the study was to postdict improvement
or nonimprovement of behavior of lobotomized
patients, it was necessary to have
a criterion measure which would assess both 
preoperative and postoperative behavior of 
these patients and which would yield some 
quantified measure reflecting the degree of 
change, if any. Ratings of improvement two 
to five years after operation were made on 
the basis of the following scale, borrowed 
from the Columbia-Greystone study (2, p. 
389): 


Rating of 5—on most disturbed ward. 

Rating of 4—on moderately disturbed ward. 

Rating of 3—still institutionalized but on ground 
privileges. 

Rating of 2—at home under supervision. 

Rating of 1—at home and working or capable of 
working. 


The improvement measure was the difference be- 
tween the postoperative and preoperative score plus 
one bonus point (if appropriate) for each of the 
following postoperative characteristics: 

a. A change from a most disturbed ward to ground 
parole. 

b. A move from any place in the hospital to home. 

c. The ability to work at reduced capacity. 

d. The ability to work at former capacity. 


The maximum possible score was 8. 
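
The arithmetic of the criterion can be sketched as follows. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, and the difference is taken as preoperative minus postoperative rating so that improvement is positive, which is the reading consistent with a maximum of 8 (four rating steps plus four bonus points). The text does not say how a worsening patient was scored, so no floor is applied here.

```python
def improvement_score(preop_rating, postop_rating, *,
                      ward_to_ground_parole=False,   # bonus (a)
                      hospital_to_home=False,        # bonus (b)
                      works_reduced_capacity=False,  # bonus (c)
                      works_former_capacity=False):  # bonus (d)
    """Return the criterion improvement score (maximum 8)."""
    for rating in (preop_rating, postop_rating):
        if rating not in (1, 2, 3, 4, 5):
            raise ValueError("ratings run from 1 (best) to 5 (most disturbed ward)")
    # Lower ratings are better, so improvement is preoperative minus postoperative.
    change = preop_rating - postop_rating
    # One bonus point for each postoperative characteristic judged present.
    bonus = sum([ward_to_ground_parole, hospital_to_home,
                 works_reduced_capacity, works_former_capacity])
    return change + bonus


def is_success(score):
    """Success as defined later in the paper: two or more points of improvement."""
    return score >= 2


# Hypothetical patient: most disturbed ward (5) before, ground privileges (3) after.
example = improvement_score(5, 3, ward_to_ground_parole=True)
print(example, is_success(example))  # 3 True
```
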


The improvement criterion had shortcom- 
ings, but for the population for which it was 
designed and used, it appeared to be sensi- 
tive enough to allow gross evaluation of im- 
provement. 

Malignancy rating. A third measure, a pre- 
operative malignancy estimate, was made to 
shed further light on the nature of the prog- 
nosis scale. In the Columbia-Greystone study 
(2), it was found that malignancy ratings 
correlated moderately with improvement. 
Data obtained in applying the 32-item draft 
of the prognosis scale to the Greystone sam- 
ple suggested that this estimate might be 
thought of as a crude measure of malignancy 
when malignancy is defined and rated as fol- 
lows (2): 


1. Purely affective psychosis. 

2. Schizophrenic features plus considerable affect, 
little or no deterioration. 

3. Schizophrenic features with deterioration, little or 
no affect, inappropriate affect. 


It was tentatively assumed that these con- 
cepts crudely reflect an underlying trait con- 
tinuum. 
Cross-Validation Study 


The design of the Palo Alto study was es- 
sentially the same as that at Canandaigua ex- 
cept that provision was made to check the 
reliability of the prognosis and criterion im- 
provement ratings, and to have the improve- 
ment ratings made by different judges. Four 
raters were each assigned 30 records from 
among the first 60 patients who received lo- 
botomies at the VA Hospital, Palo Alto. The 
raters included a staff psychiatrist of diplo- 
mate status, a staff psychiatric social worker, 
a staff psychologist, and a clinical psychology 
trainee. Two judges independently rated each 
patient on the prognosis and malignancy 
scales. An averaged rating was used in each 
case in the statistical computations. A re- 
search assistant rated each patient on the 
improvement scale and a second staff psy- 
chologist rerated the first 29 cases for reli- 
ability purposes. One brief training session 
and a preliminary rating manual were used 
to guide the prognosis ratings. 


Description of the Samples 


Some population characteristics of the Can- 
andaigua and Palo Alto samples are given in 
Table 1. From the malignancy ratings, it ap- 
pears that the Canandaigua sample was 
somewhat more deteriorated as a group. The 
differential improvement rates for the two 
samples are partly accounted for on this basis. 


Revision of the Scale 


On the basis of a chi-square item analysis 
the 23-item scale was revised. Using the avail- 
able data from three populations (Greystone, 
Canandaigua, Palo Alto), 14 of the 23 items 
were significant beyond the .05 level and 
were included in the final form, called the 
Palo Alto Lobotomy Prognosis Scale. Two 
additional items (sociability prior to illness 
and undertalkativeness) were also retained, 
though they did not show statistical signifi- 
cance. Since these items have been consist- 
ently found of value in other studies (9), it 
was felt that with predictive rather than post- 
dictive use of the scale, the data necessary to 
rate these factors reliably would be available. 
This assumption will have to be tested later. 
Items eliminated from the scale were of two 
kinds: items on which nearly all lobotomy 


Table 1 
Selected Population Characteristics of the Scale Construction
and Cross-Validation Samples 

                                   Canandaigua    Palo Alto
Variable                             sample         sample
N                                      55             60
Age in years
  Mean                                 35.0           39.4
  Sigma                                 9.4           11.3
  Range                               22-58          23-66
Years from first hospitalization
  Mean                                  6.2           11.2
  Sigma                                 4.2            8.1
  Range                              1.5-25        2.2-34.9
Years since operation
  Mean                                  3.7            3.1
  Sigma                                 .96           1.12
Diagnosis
  Schizophrenic reaction
    Paranoid type                      24             23
    Catatonic type                     13              2
    Hebephrenic type                   14             18
    Mixed type                          3
  Manic-depressive reaction
    Manic (chronic) type                1              0
    Depressed type                      0              3
Malignancy rating
  1                                     0              2
  2                                    19             27*
  3                                    36             31*

* Average of two raters. 


patients scored in the same direction, and 
items of extremely low reliability. The items 
of the final scale stated in a prognostically 
negative direction were as follows: 


1. No previous remissions (not home 3 months or
   more) (1)⁵
2. Irritable prior to illness (1)
3. Not sociable prior to illness (1)
4. Failure to show temporary moderate to marked
   improvement following convulsive shock therapy (2)
5. Undertalkative speech (1)
6. Incoherent ramblings and “word salad” speech (1)
7. Appearance shabby (1)
8. Presence of auditory hallucinations (2)
9. Absence of awareness of mental disorder (1)
10. Absence of elated, depressed, or anxious affect (2)
11. Presence of flat, apathetic mood (2)
12. Disoriented to time, place, or person (1)
13. Wechsler-Bellevue Arithmetic weighted score 6
    or under (1)
14. Wechsler-Bellevue Information weighted score 6
    or under (1)
15. Wechsler-Bellevue Memory Scale 80 or under (2)
16. The following diagnoses only are weighted as
    indicated:
    a. Involutional psychosis (0)
    b. Manic-depressive, depressed (0)
    c. Manic-depressive, manic (chronic not oscillating) (1)
    d. Schizophrenic, paranoid (1)
    e. Schizophrenic, catatonic (2)
    f. Schizophrenic, mixed (2)
    g. Schizophrenic, hebephrenic (3)

5 Weights are indicated in parentheses. Instructions
for rating these items and for determination of prognosis
scores are contained in the manual referred to
in footnote 3.


This scale was again scored for both sam- 
ples and the new cutting score tested. 
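
As a rough illustration of how the listed weights combine, the sketch below sums the weights of the items rated present and adds the diagnosis weight. The item keys are hypothetical labels invented for this example, and the rules for deciding whether an item is present are those of the manual, which is not reproduced here. The cutoff constant reflects the statement accompanying Table 2 that the cutting score falls between 14 and 15.

```python
# Weights for items 1-15 (keys are hypothetical labels for the listed items).
ITEM_WEIGHTS = {
    "no_previous_remissions": 1,
    "irritable_prior_to_illness": 1,
    "not_sociable_prior_to_illness": 1,
    "no_improvement_after_shock_therapy": 2,
    "undertalkative_speech": 1,
    "incoherent_speech": 1,
    "appearance_shabby": 1,
    "auditory_hallucinations": 2,
    "no_awareness_of_disorder": 1,
    "no_elated_depressed_or_anxious_affect": 2,
    "flat_apathetic_mood": 2,
    "disoriented": 1,
    "wb_arithmetic_6_or_under": 1,
    "wb_information_6_or_under": 1,
    "memory_scale_80_or_under": 2,
}

# Weights for item 16 (diagnosis).
DIAGNOSIS_WEIGHTS = {
    "involutional_psychosis": 0,
    "manic_depressive_depressed": 0,
    "manic_depressive_manic": 1,
    "schizophrenic_paranoid": 1,
    "schizophrenic_catatonic": 2,
    "schizophrenic_mixed": 2,
    "schizophrenic_hebephrenic": 3,
}

CUTOFF = 14  # scores of 14 or less are on the favorable side of the cutting score


def prognosis_score(items_present, diagnosis):
    """Sum the weights of the negative items present plus the diagnosis weight."""
    items = set(items_present)
    unknown = items - ITEM_WEIGHTS.keys()
    if unknown:
        raise ValueError(f"unknown items: {sorted(unknown)}")
    return sum(ITEM_WEIGHTS[i] for i in items) + DIAGNOSIS_WEIGHTS[diagnosis]


def is_favorable(score):
    """Lower scores are prognostically favorable."""
    return score <= CUTOFF


# Hypothetical patient with two negative items and a paranoid diagnosis.
score = prognosis_score(["auditory_hallucinations", "flat_apathetic_mood"],
                        "schizophrenic_paranoid")
print(score, is_favorable(score))  # 5 True
```

With every item present and a hebephrenic diagnosis the total is 23, which agrees with the top of the score range shown in Table 2.
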


Results 
The 23-Item Prognosis Scale 


Reliability. Rater reliability of the 23-item 
preliminary prognosis scale was found to be 
.88 on the Palo Alto sample. Rater reliability 
of the criterion rating was .92 for the first 29 
cases in the Palo Alto sample. 

Validity data. In computing validity sta- 
tistics, two cutting scores had to be deter- 
mined, one on the improvement criterion and 
one on the prognosis scale. Successful re- 
sponse to lobotomy was defined as improve- 
ment to ground privileges or better, which 
cuts the criterion of improvement between a 
rating of 1 (“less of a management problem”)
and 2 (“has some ground privileges”). 
The effect this definition of successful lo- 
botomy has on the usefulness of the scale will 
be discussed later. The cutting score on the 
new prognosis scale was based on a com- 
promise between the best statistical predic- 
tion (maximum true positives and true nega- 
tives) and an attempt to detect as many as 
possible that might improve to criterion 
(minimum false negatives). 

If we accept an improvement to ground 
privileges or better as adequate justification 
for lobotomy, and use a cutting score deter- 
mined from the Canandaigua sample by the 
above procedure, the following validation and 
cross-validation statistics obtain: 

If the 23-item prognosis scale were 
employed (using our cutting score) to select 
patients for lobotomy, (a) the percentage of 
successful operations for the Canandaigua 
sample would be raised from 22 to 38%. 
Only two patients who improved to criterion 
would be overlooked (false negatives), and 
the number of operations could be reduced 
by more than half; (b) on the Palo Alto 
sample with the same cutting scores, use of 
the prognosis scale would raise the percentage 
of successes from 35 to 58%. Five patients 
who improved would be overlooked, and 
again the number of operations could be re- 
duced by more than 50%; (c) the percent- 
age of successful predictions is 67% for the 
Canandaigua sample and 73% for the Palo 
Alto sample;⁶ (d) chi squares for each sample
are significant beyond the .01 level. 


Palo Alto Lobotomy Prognosis Scale 


The revised scale has the same interrater 
reliability (.88) as the original scale even 
though it is seven items shorter. Distribution 
data for the validation and cross-validation 
study are presented in Table 2. 

With the same procedure for determining 
the cutting scores as for the original scale, 
the following validation data obtain: (a) If 
this scale were used to select patients for 
lobotomy operations and the cutting score 
(as indicated in Table 2) derived from the 
Canandaigua data were used, the percentage 
of successful lobotomies in the Canandaigua 
sample would be raised from 22% to 35%. 
Only two patients who improved to criterion 
would be overlooked (false negatives) and 
the number of operations could be reduced 
by 50%. (b) On the Palo Alto sample if this 
scale were employed using the cutting score 
from the Canandaigua sample to select pa- 
tients for lobotomy, the percentage of suc- 
cesses would be raised from 35% to 59%. Two 
patients who improved would be overlooked, 
and the number of operations could be more 
than halved. (c) The percentage of success- 
ful predictions is 65 for the Canandaigua 
sample and 73 for the Palo Alto sample.⁷ (d) 


6 With a cutting score determined from the Canan- 
daigua sample to give the best statistical prediction, 
the percentage of successful predictions is 73% for 
Canandaigua and 78% for Palo Alto. 

7 With a cutting score (between 10 and 11) de- 
termined from the Canandaigua sample to give the 
best statistical prediction, the percentage of success- 
ful predictions is 78% for both samples. 
Table 2 
Distribution of Cases for the Relationship of the 
Prognosis Score to Improvement 

                     Canandaigua            Palo Alto
                  improvement rating    improvement rating
Prognosis score*     0-1      2-8          0-1      2-8
2-6                   2        2            0        6
7-10                  5        5            3        5
11-14†               10        3           10        8
15-18                16        1           16        1
19-23                10        1           10        1
Total N’s            43       12           39       21

* Lower prognosis scores and higher improvement ratings are 
in the favorable direction. 
† The cutting score on the prognosis scale falls between 14 
and 15. 

Chi squares based on the cutting scores indi- 
cated in Table 2 are significant beyond the 
.01 level for each sample. 
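
The counts in Table 2 are enough to recompute the selection statistics for any cutting score that falls between two score bins. The sketch below is one way to do so; the function and key names are hypothetical, and the chi square shown is the ordinary Pearson statistic for a 2 x 2 table without continuity correction, which is an assumption, since the text does not say which form was used. Because the table is binned, recomputed values need not match every percentage quoted in the text exactly.

```python
# (low score, high score, cases rated 0-1, cases rated 2-8), copied from Table 2.
TABLE_2 = {
    "Canandaigua": [(2, 6, 2, 2), (7, 10, 5, 5), (11, 14, 10, 3),
                    (15, 18, 16, 1), (19, 23, 10, 1)],
    "Palo Alto": [(2, 6, 0, 6), (7, 10, 3, 5), (11, 14, 10, 8),
                  (15, 18, 16, 1), (19, 23, 10, 1)],
}


def summarize(rows, cutoff=14):
    """Collapse binned counts to a 2x2 table at `cutoff` (use 6, 10, 14, or 18)."""
    tp = fp = tn = fn = 0
    for low, high, unimproved, improved in rows:
        if low <= cutoff < high:
            raise ValueError("cutoff must fall between two score bins")
        if high <= cutoff:   # favorable prognosis: patient would be operated on
            tp += improved
            fp += unimproved
        else:                # unfavorable prognosis: operation would be withheld
            fn += improved
            tn += unimproved
    n = tp + fp + tn + fn
    # Pearson chi square for a 2x2 table, no continuity correction.
    chi_square = n * (tp * tn - fp * fn) ** 2 / (
        (tp + fp) * (fn + tn) * (tp + fn) * (fp + tn))
    return {
        "base_rate": (tp + fn) / n,
        "success_rate_if_selected": tp / (tp + fp),
        "false_negatives": fn,
        "operations_avoided": (tn + fn) / n,
        "correct_predictions": (tp + tn) / n,
        "odds_no_improvement_above_cutoff": tn / fn,
        "chi_square": chi_square,
    }


for sample, rows in TABLE_2.items():
    stats = summarize(rows)
    print(sample, {key: round(value, 2) for key, value in stats.items()})
```

At the cutting score of 14/15 this gives two false negatives and odds of 13 to 1 against improvement above the cutoff in both samples, and chi squares above the .01 critical value of 6.635 for one degree of freedom. At a cutting score of 10/11 the proportion of correct predictions rounds to 78% in both samples.
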


Malignancy Ratings 

The interrater reliability of the malignancy 
ratings was .70 for the Palo Alto sample. As 
was suggested from applying the preliminary 
scale to the Greystone sample, the prognosis 
scale shows a high relationship to malignancy. 
In the Canandaigua sample the biserial r was 
.79, and in the Palo Alto sample the product-moment
r was .83.⁸ So high a relationship 
indicates that the prognosis scale is pri- 
marily a measure of the degree or severity of 
illness. If it were not for its coarse nature and 
lower reliability, the malignancy rating by 
itself would be an adequate prognostic de- 
vice. It correlates with improvement just as 
highly (Pearson r, .50) as the prognosis scale 
(Pearson r, .48) for the Palo Alto sample. 
However, the prognosis scale has the advantage
of finer differentiation leading to better 
prediction. 
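
The footnote to the .83 correlation suggests that averaging two raters may explain why the correlation exceeds what the reliabilities would allow. A worked check of that suggestion is sketched below. It rests on assumptions not stated in the paper: that the reported reliabilities (.88 and .70) are correlations between two single raters, that the Spearman-Brown formula applies to the averaged ratings, and that the ceiling on an observed correlation is the square root of the product of the two reliabilities.

```python
import math


def spearman_brown(r, k=2):
    """Reliability of the mean of k parallel ratings, given single-rater r."""
    return k * r / (1 + (k - 1) * r)


def correlation_ceiling(reliability_x, reliability_y):
    """Largest observed correlation two fallible measures can show."""
    return math.sqrt(reliability_x * reliability_y)


PROGNOSIS_R = 0.88   # interrater reliability of the prognosis scale
MALIGNANCY_R = 0.70  # interrater reliability of the malignancy rating
OBSERVED = 0.83      # Palo Alto correlation between the two measures

single = correlation_ceiling(PROGNOSIS_R, MALIGNANCY_R)
averaged = correlation_ceiling(spearman_brown(PROGNOSIS_R),
                               spearman_brown(MALIGNANCY_R))
print(f"ceiling with single ratings:   {single:.3f}")
print(f"ceiling with averaged ratings: {averaged:.3f}")
```

Under these assumptions the observed value of .83 lies above the single-rating ceiling but below the ceiling for averaged ratings, which is in line with the explanation the footnote offers.
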


Implications 
Malignancy and Improvement 


Analysis of the relations between the prog- 
nosis scale scores and malignancy ratings 


8 The fact that the correlation between malignancy 
ratings and prognosis scores on the Palo Alto sam- 
ple is higher than the reliabilities would allow is 
possibly a function of the increased reliability ob- 
tained from averaging the two ratings. Chance fac- 
tors may also enter. 


strongly supports the idea that patients who 
improve as a result of prefrontal lobotomy 
are less mentally ill than those patients who 
do not profit from lobotomy. This idea is not a 
new one. Stotsky’s (7) study of remitting and 
nonremitting schizophrenics shows that re- 
mitters are more like severe neurotics on the 
Rorschach than nonremitters. The study of 
process and reactive schizophrenics by Kan- 
tor, Wallner, and Winder (3) leads to a simi- 
lar conclusion. Wittman’s work (8) with the 
Elgin Prognosis Scale also supports this the- 
sis. In the neurotic range of severity, Barron 
(1) has demonstrated that the neurotics who 
profit most from psychotherapy are better 
adjusted prior to treatment. The apparent 
truth of this thesis has the following implica- 
tions for further work in this area. First, the 
search for prognostic indicators should be 
more actively centered on variables which 
correlate with severity of illness. Items in this 
lobotomy prognosis scale suggest many areas 
for more intensive study. Second, it becomes 
essential, particularly with small samples, 
that studies of the differential effects of vari- 
ous therapies be controlled for severity of 
illness. The prognosis scale discussed in this 
paper might be useful for this purpose with 
some severely ill populations. 


Uses and Limitations of the Scale 


a. Present validity data are limited to male 
veterans in VA hospitals referred on some 
basis other than their scores on this scale as 
lobotomy candidates. 

b. Our definition of “success” on the criterion
improvement rating has a definite effect
on the utility of the prognosis scale as 
a selection device. We accept as our defini- 
tion of success an improvement to ground 
privileges or better (two or more points on 
the improvement scale). If one chooses the 
less stringent criterion of better ward man- 
agement (an improvement of one point), the 
prognosis scale has little value in selection, 
since most patients under present selection 
methods improve this much. However, if the 
patient has a negative prognosis score in 
terms of our cutoff point, the scale would 
indicate that better management would be 
the most that could be expected. Then the 
decision to operate or not, under this prediction
of limited improvement, could be 
weighed against such factors as possible de- 
fects the lobotomy operation creates, the 
surgical risks involved, and the cost of the 
procedure. 

c. A favorable score on the scale does not 
insure success from lobotomy. However, the 
probability of success, accepting our criterion 
of improvement, can be stated to parents or 
relatives who must approve the operation. It 
must be pointed out that the revised progno- 
sis scale, from a statistical standpoint, rules 
out lobotomy operation with greater cer- 
tainty than it rules in the procedure. Roughly 
speaking, a score above the cutoff indicates 
with odds of 13 to 1 that the patient can 
be expected to show little or no improvement, 
whereas scores below the cutoff indicate with 
odds of only 1 to 1 that a patient should 
demonstrate marked improvement. 

d. A final caution in the use of the scale 
involves the fact that all of the present data 
on the utility of the scale are based on post- 
dictive studies. Though there is no reason to 
believe the scale will not be useful in predic- 
tion, it has not yet been demonstrated. The 
fact that we can identify with greater cer- 
tainty those patients on whom operation 
should not be performed suggests that other 
types of psychological and/or physiological 
data are needed to identify with greater ac- 
curacy those patients who will improve as a 
result of lobotomy. 


Summary 


A series of studies was undertaken to de- 
velop and evaluate a lobotomy prognosis 
scale. Using historical, behavioral, and men- 
tal status material found to have some prog- 
nostic significance by Greenblatt et al. (2), 
we developed a preliminary 32-item scale. 
Modifications of the scale were made on the 
basis of data obtained in applying it to 24 
published case histories from the Columbia- 
Greystone study of cortical ablation. The re- 
sulting 23-item scale was then validated 
against the records of 55 lobotomy patients 
at the VA Hospital, Canandaigua, New York, 
and cross-validated on 60 lobotomy patients 
at the Palo Alto VA Hospital. Evaluation of 
improvement was made at the time of the 


study, which was from two to five years after 
operation. Finally the scale was revised on 
the basis of a chi-square item analysis of the 
data from the three samples (Greystone, 
Canandaigua, and Palo Alto). 

The results indicated that the prognosis 
scale predicts to a practical degree both those 
patients who are likely to improve to ground 
privileges or better and those patients who 
demonstrate little or no improvement after 
lobotomy. The high two-rater reliability of 
the scale, and the cross-validation stability 
of the cutting score and percentage of suc- 
cessful predictions suggested that consider- 
able confidence may be placed in the re- 
peatability of the findings. Cautions and limi- 
tations are discussed in the text. 

In an attempt to clarify the underlying na- 
ture of the prognosis scale, ratings of malig- 
nancy were correlated with the prognosis 
scores. Correlations of about .80 found in 
three samples force a strong conclusion that 
the prognosis scale is a refined rating of se- 
verity of illness. 


Received August 19, 1954. 
Psychosomatic Patients and Other Physically 
Ill Persons: 


A Comparative Study¹


Sheldon E. Waxenberg 
The Mount Sinai Hospital, New York, N. Y.²


Psychologists who work in general hospital 
settings frequently have occasion to evaluate 
patients’ responses in the specific light of 
psychosomatic theories. Awareness that these 
theories have not been subjected to rigorous 
experimental confirmation causes certain per- 
sistent questions to arise, especially if some 
patients are regarded as psychosomatic cases 
while others are thought of simply as physi- 
cally ill people. Are patients with diseases 
such as asthma and ulcerative colitis, which 
are widely referred to as psychosomatic be- 
cause they appear to have mixed psychologi- 
cal and physiological etiology, actually more 
like psychiatric patients than like patients 
with ordinary somatic illnesses? Are they 
more likely to suffer from depressed moods? 
Do they present characteristically distorted 
movement and color balances and give more 
bony anatomy responses on the Rorschach? 
Other questions also arise, concerned with 
whether or not the differences in psychody- 


1 The data were collected at Mount Sinai Hospital,
New York City, with the permission of the 
Director of Medical Research, Dr. Alexander B. 
Gutman, while the writer was a member of the psy- 
chology staff of Dr. Fred Brown. The persons at the 
hospital who expedited and encouraged this research 
are, regrettably, too numerous for individual men- 
tion. 

The present paper is part of a dissertation pre- 
sented in partial fulfillment of the requirements for 
the degree of Doctor of Philosophy at Teachers 
College, Columbia University, 1954. The writer 
wishes to express his thanks to Drs. Edward Joseph 
Shoben, Jr., Laurance F. Shaffer, Helen M. Walker, 
and M. Ralph Kaufman for their invaluable stimu- 
lation and advice. 

2 Now at Kings County Hospital, Brooklyn, New 
York. 


namics hypothesized for the distinctive psy- 
chosomatic maladies evidence themselves in 
projective test protocols. For example, are 
oral objects and processes more prominent in 
the records of asthmatics, and anal objects 
and processes more notable in the records of 
colitis sufferers? This paper constitutes an at- 
tempt at finding answers to these and other 
questions arising out of work with a diverse 
hospital population. 


Experimental Subjects and Procedures 


In order to carry out meaningful experi- 
mental comparisons of patients with psycho- 
somatic diseases, it is essential to choose dis- 
ease entities which are widely conceived of 
as within the psychosomatic category and 
which have distinctive systemic and dynamic 
features. Asthma and ulcerative colitis qualify 
in both these respects. For purposes of 
broader comparison, a group of subjects with 
a disease definitely not currently thought of 
as psychosomatic was chosen as an experi- 
mental control. Tumor patients appeared to 
qualify in this respect, for the etiology of 
malignant tumors has not been directly linked 
with psychological mechanisms or stress. Fur- 
thermore, patients with tumors are relatively 
as ill as the subjects in the other groups. 


Subjects 


The subjects, 20 in each group, all women 
and all 23 to 46 years of age, were clinic and 
ward patients of one general hospital. The 
median age of the asthma group was 36.5, of 


the colitis group 32.5, and of the tumor group 
40.8 years. Because of a dearth of suitable 
colitis subjects in the clinics and wards, the 
members of a hospital social club for people 
who have had ileostomies performed to con- 
trol ulcerative colitis were included in the 
colitis group. Only patients found to be cur- 
rently at least occasionally subject to asthma 
attacks were included in the asthma group. 
The colitis subjects had all been hospitalized 
at least once for ulcerative colitis. All the 
tumor patients had been definitively diag- 
nosed as suffering from malignant breast, 
uterine, abdominal, or lymphatic system tu- 
mors. No one was included whose hearing or 
sight was severely impaired or who was preg- 
nant, crippled, or receiving ACTH or corti- 
sone. Nor was anyone included who in addi- 
tion to one of the three selected diseases had 
a medical history of any other crucial disease.³


Psychological Tests Used 


The battery of five tests used in this study, 
which, except for the omission of a lengthy 
intelligence scale, resembles a conventional 
diagnostic battery, comprised the following: 


The Bender-Gestalt Test. The Bender records 
were scored by the experimenter in accordance with 
Pascal and Suttell’s standardization (5) after pre- 
scribed practice with their method had resulted in 
a product-moment correlation of .89 with Pascal’s 
scores for the 20 sample records in his manual. 
Pascal reports a correlation of .90 between his own 
and his collaborator’s scorings of 120 records. 

The H-T-P Test. Drawings of house, tree, and 
person were supplemented by another drawing. 
After the first human figure was finished, the sub- 
ject was requested to draw a figure of the opposite 
sex. 

The Rorschach test. The Rorschach records were 
scored in accordance with Klopfer’s method (4). 
The proportion of agreements of scoring judgments 
made by the experimenter and by another psychologist⁴
on 20 random records ranged from .59 to .93 
for explicit agreements on individual color and 
movement determinants. When implicit agreements 
—instances in which neither scorer considered a 
response to involve a certain determinant—were 
included in the computations, the proportion of 
agreements ranged from .71 to .98 on individual 


3 The subjects were screened for an acceptable 
level of verbal fluency. Some differences were found 
in vocabulary scores, educational attainment, and 
ethnic composition of the groups, but these were in- 
vestigated wherever possible and found not to affect 
critically the major purposes of the study. 

4 Barbara R. Waxenberg independently rescored a 
portion of the records. 


determinants. The human and animal movement re- 
sponses were judged to fall into one of three cate- 
gories—active, static, or passive—in accordance with 
a classification scheme based in large measure on a 
discussion of movement responses by Phillips and 
Smith (6, pp. 75-77). Scorer reliability on these 
classifications ranged from .76 to 1.00 on explicit 
judgments and from .82 to 1.00 when implicit judg- 
ments were also taken into account. The Rorschach 
responses were also judged for oral, anal, and bony 
anatomy content. A response was considered oral if 
it involved (a) foods or beverages, (b) culinary 
tools or procedures, (c) body parts related to food 
intake, (d) oral activities such as biting, sucking, 
and licking, or (e) states of hunger or thirst. A re- 
sponse was considered anal if it involved (a) excre- 
tive functions or appliances, (b) fecal matter, filth, 
dirt, the implication of a morbid or disgusting tex- 
tural quality or viscosity, or terms such as “oozing,” 
“squishing,” etc., or (c) posterior anatomy, either 
internal or external. A percept was considered a 
bony anatomy response if it involved an anatomical 
part, either animal or human, with a specifically 
bony structure. Scorer reliability on these categories
ranged from .71 to .81 for explicit judgments 
and from .98 to .99 for all judgments. 
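
The two kinds of agreement figure reported above can be illustrated with a short sketch. The exact formula used in the study is not given, so the definitions here are assumptions: explicit agreement is taken as the share of responses scored for a category by at least one scorer that were scored for it by both, and agreement including implicit judgments is taken as the share of all responses on which the two scorers made the same judgment. The function name and the data are hypothetical.

```python
def agreement_proportions(scorer_a, scorer_b):
    """Return (explicit agreement, agreement including implicit judgments).

    Each argument is a list of booleans, one per response, that is True when
    the scorer judged the response to involve the category in question.
    """
    if len(scorer_a) != len(scorer_b) or not scorer_a:
        raise ValueError("both scorers must judge the same non-empty set of responses")
    pairs = list(zip(scorer_a, scorer_b))
    both = sum(a and b for a, b in pairs)
    either = sum(a or b for a, b in pairs)
    neither = sum((not a) and (not b) for a, b in pairs)
    # Explicit agreement is undefined if nobody ever scored the category.
    explicit = both / either if either else None
    overall = (both + neither) / len(pairs)
    return explicit, overall


# Hypothetical record of 20 responses: each scorer marks four as "oral",
# and they coincide on three of them.
a = [i in {0, 1, 2, 3} for i in range(20)]
b = [i in {0, 1, 2, 4} for i in range(20)]
print(agreement_proportions(a, b))  # (0.6, 0.9)
```

On these definitions a category that is scored rarely will show much higher agreement once implicit judgments are counted, which is the pattern in the content categories reported above.
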

Word associations. Rapaport’s revised list of 60 
word association stimuli was used (7, p. 84). The 
responses were judged by the same criteria as those 
for the oral and anal Rorschach content, and the 
number of each kind of response was totaled for 
each subject. Twenty random records were rescored 
by another judge. The two scorers agreed on .85 of 
the oral and .82 of the anal responses. With im- 
plicit agreements included, the over-all agreement 
was .97. 
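The difference between explicit and implicit agreement can be illustrated with a short sketch. The source does not state the exact formula, so the one used here is an assumption: explicit agreement is taken as the share of responses scored for a category by either judge that both judges scored, and over-all agreement also counts responses that neither judge scored. The function name and the sample ratings are hypothetical.

```python
def agreement_rates(judge_a, judge_b):
    """Return (explicit, overall) proportions of agreement for one category.

    judge_a, judge_b: equal-length lists of booleans, True where that judge
    scored the response as belonging to the category (for example, oral).
    """
    if len(judge_a) != len(judge_b):
        raise ValueError("both judges must rate the same responses")
    both = sum(a and b for a, b in zip(judge_a, judge_b))
    either = sum(a or b for a, b in zip(judge_a, judge_b))
    neither = len(judge_a) - either  # implicit agreements
    # Assumed definitions; the source does not spell these out.
    explicit = both / either if either else 1.0
    overall = (both + neither) / len(judge_a) if judge_a else 1.0
    return explicit, overall


# Hypothetical ratings of ten word-association responses.
judge_a = [True, True, False, False, True, False, False, False, True, False]
judge_b = [True, False, False, False, True, False, False, False, True, False]
explicit, overall = agreement_rates(judge_a, judge_b)
```

Because most responses fall in neither category, counting implicit agreements raises the figure, which is consistent with the higher over-all values reported here.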

The Thematic Apperception Test. Ten TAT cards 
(1, 2, 3BM, 4, 8BM, 12M, 13MF, 15, 18GF, 16), 
always in this order, were used. Each story was 
rated for emotional tone in accordance with a scale 
devised by Eron (2, Appendix A). An algebraic sum 
was computed for the ratings of the ten stories of 
each subject. The resultant score for every one of 
the 60 subjects was negatively signed, indicating a 
weighting toward sad stories, which also prevailed 
in Eron’s studies. Because of this uniformity, the 
signs were dropped and a set of mood scores was 
obtained which ranged from 1, indicating a neutral 
tone, to 14, reflecting a rather deep dysphoria. 
Twenty random records were rescored by another 
judge. The product-moment correlation of the two 
sets of ratings was .82. Eron obtained correlations 
of .86 between raters in his own studies (2, 3). 
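The scoring and reliability steps just described can be sketched as follows. This is only an illustration: the per-story ratings and the two judges' scores are hypothetical, the rating range is not taken from the source, and the function names are made up for the example.

```python
import math


def mood_score(story_ratings):
    """Algebraic sum of the ten story ratings with the sign reversed.

    Every total in the study was negative, so reversing the sign is the
    same as dropping it; a larger score means greater dysphoria.
    """
    return -sum(story_ratings)


def pearson_r(x, y):
    """Product-moment correlation between two sets of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists of at least two scores")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical ratings for one subject's ten stories.
example_score = mood_score([-1, -2, 0, -1, -1, -2, 1, -1, 0, -2])

# Hypothetical mood scores assigned to five records by two judges.
first_judge = [9, 4, 12, 7, 10]
second_judge = [8, 5, 12, 6, 11]
rater_r = pearson_r(first_judge, second_judge)
```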


Hypotheses and Results 
Certain hypotheses were implied in the
questions raised in the introductory paragraph.
These and other hypotheses suggested
by theories and intuitions about personality 
dynamics which are applied and found em- 
pirically useful in clinical work are here 
stated explicitly and tested by statistical
methods. 


Psychiatric Patient Status on the Bender- 
Gestalt Test 


Pascal’s scoring method for the Bender- 
Gestalt (5), which differentiates psychiatric 
clinic and hospital patients from nonpatients, 
was used in this study to test the hypothesis 
that the behavior of a person suffering from 
a psychosomatic disease is more likely to re- 
semble the behavior of a psychiatric patient 
than is the behavior of an individual with a 
disease which is not spoken of as psychoso- 
matic. All subjects were dichotomized ac- 
cording to their standard scores, with a score 
of 67, the point at or beyond which only 5 in 
100 normal persons’ scores would be expected 
to fall, serving as a cutoff point. Low scores 
which placed them in the nonpsychiatric 
category were obtained by 7 asthma, 13 
colitis, and 5 tumor subjects. The balance of 
the 20 subjects in each group fell in the class 
resembling psychiatric patients. When the 
asthma and colitis groups were compared by 
chi square, a significant difference did not 
appear.⁵ Since they could then be regarded
as drawn from the same population, they were 
combined into a composite psychosomatic 
group for comparison with the control tumor 
group. The difference between these groups 
was not significant either. Hence, it cannot 
be concluded that people with psychosomatic 
diseases differ from other ill people in respect 
to psychiatric status. 
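The two comparisons in this section can be retraced with a small sketch. It assumes the standard fourfold chi-square formula and reads the authors' footnote on Yates' correction as: apply it when the least expected frequency is below 10, unless it would raise chi square. The function name and the `yates_below` parameter are hypothetical, and the authors' own computational steps are not given in the source.

```python
import math


def chi_square_2x2(a, b, c, d, yates_below=10.0):
    """Chi square (1 df) for the fourfold table [[a, b], [c, d]].

    Returns (chi2, p, used_yates).
    """
    n = a + b + c + d
    r1, r2, c1, c2 = a + b, c + d, a + c, b + d
    denom = r1 * r2 * c1 * c2
    if denom == 0:
        raise ValueError("every row and column needs at least one case")
    diff = abs(a * d - b * c)
    plain = n * diff ** 2 / denom
    corrected = n * (diff - n / 2) ** 2 / denom
    least_expected = min(r1, r2) * min(c1, c2) / n
    # Assumed reading of the footnote on when the correction applies.
    used_yates = least_expected < yates_below and corrected <= plain
    chi2 = corrected if used_yates else plain
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi square with 1 df
    return chi2, p, used_yates


# Counts from the text: (nonpsychiatric range, psychiatric range) per group of 20.
asthma, colitis, tumor = (7, 13), (13, 7), (5, 15)
asthma_vs_colitis = chi_square_2x2(*asthma, *colitis)
psychosomatic = (asthma[0] + colitis[0], asthma[1] + colitis[1])
psychosomatic_vs_tumor = chi_square_2x2(*psychosomatic, *tumor)
```

Under these assumptions the chi squares come to about 3.60 and 2.48, and neither reaches the .05 level, in line with the result reported above.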


Mood Scores on the Thematic Apperception 
Test 


Eron’s rating scale for emotional tone of 
TAT responses was designed to discriminate 
between stories expressive of aspirations and 
pleasant expectations and stories of fearsome, 
frustrating, and violent events associated with 
feelings of guilt, regret, and resignation. Psy- 
chosomatic patients are considered to be 


5 Yates’ correction for continuity was applied 
whenever a least expected frequency was less than 
10, except where its application would operate to 
increase chi square. The .05 level of confidence is 
accepted as defining significance throughout this 
study. 
immobilized by conflicting needs and motivations
which produce dysphoria. The hypothesis
that such patients are more subject to
dysphoric moods than are other sick people 
was tested by means of dichotomization of 
the subjects on the basis of TAT mood 
scores falling above or below the median of 
9.4. Scores indicating greater dysphoria than 
the median score were obtained by 10 asthma, 
11 colitis, and 8 tumor patients. Group com- 
parisons by means of chi square, similar to 
those described in the preceding section, pro- 
duced no evidence to indicate that dysphoria 
is more commonly a concomitant of psycho- 
somatic illness than of other physical illness. 


Sequence and Size of Male and Female Fig- 
ure Drawings 


In psychosomatic formulations, asthmatics 
are regarded as passive, dependent, and very 
much identified with supportive maternal fig- 
ures, while women afflicted with ulcerative 
colitis are regarded as assertive and domi- 
nating and often career oriented. The per- 
sonal projections obtained in human figure 
drawings were used to test these psycho- 
sexual differentiations. It was hypothesized 
that women asthmatics, in carrying out the 
figure-drawing task, draw the female figure 
before the male figure and tend to draw the 
female larger in vertical dimension than the 
male figure, while women with ulcerative co- 
litis reverse the process, drawing the male first 
and larger. The female was drawn first by 13 
asthma, 16 colitis, and 14 tumor subjects. 
Pairs of figures were regarded as similar in 
size whenever the difference in their vertical 
dimensions, measured to the nearest half 
centimeter, was less than 1.5 centimeters. In 
only 39 cases were there differences this large 
or larger. The female figure was drawn larger 
by 11 asthmatics as against 4 who drew the 
male larger, by 6 colitis patients as against 6 
who drew the male larger, and by 10 tumor 
patients as against 2 who drew the male 
larger. When the asthma and colitis groups 
and then the combined psychosomatic and 
control groups were compared on each of 
these variables by means of chi square, no 
significant differences emerged. 
Hypotheses Concerned with Rorschach Measures


Consideration was given to the factor of 
total productivity before counts of Rorschach 
determinants were used for testing experi- 
mental hypotheses. The number of responses 
in each of the 60 Rorschach records ranged 
from 7 to 43. The median fell between 15 
and 16. Fewer than the median number of 
responses were given by 14 asthma, 10 colitis, 
and 9 tumor patients. Chi squares offered no 
basis for doubting the homogeneity of these 
groups with respect to total responsiveness. 
However, since the number of responses in 
an individual record is correlated to some ex- 
tent with each of the major determinants, it 
is advisable wherever practicable to examine 
each hypothesis in such a way as to impose 
a degree of control on productivity. The 
method adopted here is to test a hypothesis 
first with only those records containing fewer 
than the median number of responses, then 
repeat the test with only those records hav- 
ing more than the median number of re- 
sponses, and finally repeat the test using all 
the records. 

Coarctation. When both the total of hu- 
man movement responses and the weighted 
color sum fail to exceed one, a record is con- 
sidered to be coarctated. Observation in psy- 
chiatric clinics devoted to treatment of psy- 
chosomatic cases suggests a prevalence of 
coarctation in Rorschach productions of this 
group. The question arises, however, whether 
it is primarily a matter of the patient’s being 
physically ill in general or a matter of his 
having a disease regarded as psychosomatic. 
It was here hypothesized that Rorschach rec- 
ords of psychosomatic patients are more 
likely to show coarctation than are the rec- 
ords of other physically ill persons. Coarcta- 
tion was found in the shorter records of five 
asthma, three colitis, and four tumor pa- 
tients and in the longer record of one tumor 
patient. Chi squares revealed no significant 
differences in comparisons of the asthma and 
colitis groups or of the combined psychoso- 
matic and control groups, when shorter or 
longer or all records together were considered. 

Human movement and color response patterns.
Instead of the coarctative manifestations
described above, a somewhat different
patterning of color and human movement responses
is considered by Phillips and Smith
(6) to typify the psychosomatic patient. 
Their assertions are embodied in the hy- 
pothesis that psychosomatic patients typi- 
cally give no more than one human move- 
ment response but obtain weighted color 
sums of between three and five. The number 
of subjects in this study who show pre- 
cisely this pattern is very small. Three mem- 
bers of the control group produced the pat- 
tern in longer records; one asthma and two 
colitis patients did so in shorter records. For 
further testing of the hypothesis, the two 
components of the pattern were investigated 
separately. Of those producing shorter rec- 
ords, 13 asthma, 8 colitis, and 9 tumor pa- 
tients obtained color sums of less than three; 
while 1, 2, and 0 patients, respectively, ob- 
tained color sums of between three and five. 
Of those producing longer records, 5 asthma, 
1 colitis, and 5 tumor patients obtained color 
sums of less than three; while 1, 7, and 5, 
respectively, obtained color sums of between 
three and five. The total color sums were 22.0, 
54.5, and 48.5 for the respective groups. Two 
longer colitis records and one longer tumor 
record show color sums between 5.5 and 7. 
These three records were excluded from the 
tabulations; however, had they been in- 
cluded, the same statistical results would 
have been obtained. When the shorter rec- 
ord asthma and colitis categories were com- 
pared and when they were combined for com- 
parison with the control group, no significant 
chi squares were obtained. When the longer 
records, or all the records, of the asthma and 
colitis subjects were compared, however, chi 
squares significant at better than the .05 level 
of confidence were obtained, indicating that 
colitis subjects are more likely to give nu- 
merous color responses than are asthmatics. 
Comparisons of the tumor group with the 
asthma group and with the colitis group, each 
taken separately since lack of homogeneity 
forestalls combining them, offered no evidence 
of differences between the individual psycho- 
somatic groups and the control group. 
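The asthma and colitis comparison on color sums can be retraced separately for shorter, longer, and all records with the sketch below. It assumes the standard Yates-corrected fourfold formula; in each of these three tables the least expected frequency is under 10 and the correction lowers chi square, so the correction applies under the rule the authors state. The function and variable names are hypothetical.

```python
import math


def yates_chi_square(a, b, c, d):
    """Yates-corrected chi square (1 df) and p for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column needs at least one case")
    chi2 = n * (abs(a * d - b * c) - n / 2) ** 2 / denom
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Counts from the text: (color sum below three, color sum of three to five).
# Records with color sums above five are left out, as in the tabulations.
counts = {
    "shorter": {"asthma": (13, 1), "colitis": (8, 2)},
    "longer": {"asthma": (5, 1), "colitis": (1, 7)},
}
# Pool the two strata to obtain the table for all records.
counts["all"] = {
    group: tuple(
        counts["shorter"][group][i] + counts["longer"][group][i] for i in range(2)
    )
    for group in ("asthma", "colitis")
}

results = {
    stratum: yates_chi_square(*groups["asthma"], *groups["colitis"])
    for stratum, groups in counts.items()
}
```

Under these assumptions the shorter records give a chi square near 0.10, while the longer records and all records give about 4.43 and 5.55, both beyond the .05 level, which matches the pattern reported above.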

Now, what may be said of the other half 
of the pattern? Of those producing shorter 
records, 8 asthma, 7 colitis, and 7 tumor patients
gave one M response or no M at all;
while 6, 3, and 2 patients, respectively, gave
more than one M response. Of those produc- 
ing longer records, 5 tumor patients gave no 
more than one M response; while 6, 10, and 
6 patients, respectively, gave more than one 
M response. The respective groups produced 
totals of 51, 47, and 28 M responses. When 
comparisons similar to those described above 
were made on the human movement variable, 
only one significant difference was found, 
that between the longer record subjects of 
the composite psychosomatic group and of 
the control group. However, the direction of 
the difference was opposite to that hypothe- 
sized. About half of the 11 control group 
subjects who gave longer protocols fall in 
the 0-1 M category, whereas not one of the 
16 psychosomatic subjects who gave longer 
records had such a small number of M re- 
sponses. From all these findings, there would 
appear to be very little reason for believing 
that the color-movement pattern proposed by 
Phillips and Smith is characteristic of psy- 
chosomatic records. 

Active and passive ideation in movement 
responses. Human and animal movement re- 
sponses on the Rorschach were separately 
differentiated as being active, static, or passive
for the purpose of testing the hypothesis
that asthma patients predominantly project 
basic passivity and dependency needs while 
colitis patients reflect rather strong strivings 
and aggressive drives in their fantasy pro- 
ductions. After rather complex categoriza- 
tions were made, chi-square tests on the re- 
sultant data brought to light no significant 
differences between the groups. 

Static human movement responses. Static 
human movement responses reflecting inhibi- 
tion of hostile activity and resultant tension 
in situations of conflict between activity and 
passivity indicate psychosomatic symptoma- 
tology according to Phillips and Smith (6, p. 
77). The median number of such responses 
per record was between 0 and 1 for the subjects
of this study. The records of two
asthma, eight colitis, and three tumor sub- 
jects contained at least one static M re- 
sponse. When chi-square tests were made on 
the basis of the presence or complete ab- 
sence of static M’s, no substantiation was 
obtained for the hypothesis that psychosomatic
patients are more likely to produce
static M’s than are other physically ill 
people. 

Bony anatomy. In an extensive review of 
types of Rorschach content, Phillips and 
Smith (6, pp. 124-127) declare that bony 
anatomy content is an index of the extent to 
which psychosomatic patients are engaged in 
inhibiting the expression of hostile impulses 
which are ultimately expressed through soma- 
tization. The hypothesis that psychosomatic 
subjects are more likely to produce bony 
anatomy responses than are other sick per- 
sons was tested by classifying the subjects 
according to whether they gave no bony 
anatomy responses or at least one bony 
anatomy response, the median lying between 
these limits. One or more such responses were 
given by 11 asthma, 9 colitis, and 11 tumor 
patients. The responses total 24, 11, and 32 
for the respective groups. Chi squares gave 
no evidence of significant differences. 

Oral and anal Rorschach content. It is hy- 
pothesized that asthmatics, considered to be 
conflictually absorbed with problems of suc- 
corance and maternal protection, fill their 
Rorschachs with inordinate amounts of oral 
content, while colitis patients produce an un- 
usual amount of anal content, either as an 
expression of deep, hidden preoccupations 
and motivations or as a reflection of pressing 
concern with the problem of fecal elimina- 
tion. The median number of oral responses, 
and also of anal responses, for all subjects 
was between 0 and 1. One or more oral re- 
sponses were given by 9 asthma, 10 colitis, 
and 11 tumor patients, for group totals of 
12, 21, and 20, respectively. One or more anal 
responses were given by 1 asthma, 1 colitis, 
and 5 tumor patients, for group totals of 1, 
3, and 12. Chi-square tests revealed no sig- 
nificant differences. 


Oral and Anal Responses to the Word As- 
sociation Test 


The hypothesis involving oral and anal re- 
sponses was also investigated by means of a 
word association list (7, p. 84) in which the 
total number of such responses bulks much 
larger and might be expected to afford a 
better differentiation of the groups. Oral re- 
sponses ranged from 3 to 13 per record, with 
the median between 8 and 9. More than 8
oral responses were given by 10 asthma, 11 
colitis, and 11 tumor patients. Group totals 
were 177, 174, and 172, respectively. The 
number of anal responses ranged from 0 to 
5 per record, with the median between 2 and 
3. More than 2 anal responses were given by 
7 asthma, 10 colitis, and 9 tumor patients. 
Group totals were 42, 52, and 52. Chi-square 
tests failed to sustain the hypothesis derived 
from psychodynamic and symbolistic theoriz- 
ing about the expression of fixated pregenital 
oral and anal drives. 


Discussion 


The primary purpose of this study was to 
lend the support of experimental evidence to 
clinical impressions about people with psy- 
chosomatic diseases. It is widely held that 
they constitute a unique group, different 
from normal, healthy persons as well as from 
persons physically ill in the ordinary sense. 
Moreover, the psychodynamic forces at work 
in various such diseases are said to differ in 
particular ways. By means of a conventional 
test battery, these assumptions were investi- 
gated, without any evidence for their sound- 
ness being adduced. Such consistently negative 
findings cast doubt upon the basic formu- 
lations which set the direction for psycho- 
somatic practice, therapeutic as well as diag- 
nostic. The selection of a presumedly normal 
but physically ill, rather than a normal and 
physically healthy, control group placed the 
focus on two polarities of physical illness, 
psychosomatic and nonpsychosomatic. An at- 
tempt at delineating the differences between 
the two has rendered questionable the theo- 
retical differentiations which have been posited 
in the literature. Dunbar, almost twenty 
years ago, implied the inevitability of a re- 
synthesis in thinking about the two when 
she wrote, “When medicine has apprehended 
the psychosomatic problem and assimilated 
it, the adjective will be obsolete: all medicine 
will be psychosomatic” (1, p. xx). 

A question about the optimum utilization 
of the obtained data deserves attention. 
Might not global clinical evaluations of such 
phenomena as moods, identifications, preoc- 
cupations, and so forth, lead to different or 
at least more comprehensive conclusions than
did the disparate, atomistic measures used in
this study? An experimental design requiring 
a number of experienced psychological clini- 
cians to rate certain selected crucial variables 
on the basis of all the material in each sub- 
ject’s complete protocol of responses might 
make fuller use of the data. Such an alterna- 
tive approach, which is being planned, might 
afford a more convincing and thoroughgoing 
examination of the psychosomatic theories. 
In any event, it is clear from the foregoing 
findings and discussion that all major psy- 
chosomatic formulations need to be subjected 
to rigorous and searching analysis by a va- 
riety of techniques. 


Summary 


To investigate certain hypotheses about 
psychosomatic disease, experimental groups 
of 20 asthmatic women and 20 women with 
histories of ulcerative colitis and a nonpsy- 
chosomatic control group of 20 women being 
treated for malignant tumors, all between 23 
and 46 years of age and screened to elimi- 
nate patients with multiple diseases, were 
given a test battery including the Bender- 
Gestalt, figure drawings, Rorschach, word as- 
sociations, and ten TAT cards. 

Bender-Gestalt scores gave no evidence 
that psychosomatic patients differ from other 
sick people in psychiatric status. TAT mood 
scores gave no indication that psychosomatic 
patients are more subject to dysphoria than 
are other physically ill persons. Rorschach 
records revealed no significant evidence that 
psychosomatic patients are more likely to 
(a) give coarctated records, (b) give fewer
human movement responses, (c) obtain higher
weighted color sums, (d) produce static human
movement responses, or (e) produce
bony anatomy responses than are others 
physically as ill as they are. Rorschach 
movement responses did not reveal signifi- 
cant tendencies for asthmatics to project 
passivity and dependency needs or for colitis 
patients to project strivings and aggressive 
drives in their fantasy productions. No evi- 
dence was derived from word associations or 
Rorschach content to indicate that asthmatics 
are inordinately concerned with oral satisfactions
and deprivations or that colitis sufferers
are inordinately preoccupied with excretory
functions and their indirect, symbolic
counterparts. Neither the sex sequence of 
the two human figure drawings nor their rela- 
tive size offered any evidence of differences 
in the psychosexual identifications of women 
with asthma and women with ulcerative co- 
litis. The consistently negative trend of the 
findings and implications for theory and re- 
search were discussed. 


Received September 18, 1954. 
Parental Figures in Sentence Completion Test,
in TAT, and in Therapeutic Interviews 


Mortimer M. Meyer and Ruth S. Tolman 
Mental Hygiene Clinic, VA Regional Office, Los Angeles 


A follow-up study to that reported in 
“Correspondence Between Attitudes and Im- 
ages of Parental Figures in TAT Stories and 
in Therapeutic Interviews” (1) was carried 
out with the Forer Sentence Completion Test. 

Twenty of the 50 patients used in the TAT 
study had taken the Sentence Completion 
Test. The 11 items (9, 16, 33, 35, 60, 70, 76, 
88, 94, 96, 99) referring to “mother” and 
“father” were examined, and parental atti- 
tudes and images discovered there were indi- 
cated on the same check sheets used in the 
study of the TAT and of therapeutic inter- 
views. 

Analysis of the data by the same method 
gave the following results: 

1. When the Sentence Completion Test and 
the TAT were compared in regard to fathers’ 
attitudes, six of the 20 had one item in com- 
mon and one had two in common. Of the 
seven records, four could be interpreted as
challenging the null hypothesis at a level of 
confidence between 1 and 10 per cent. 

2. When the Sentence Completion Test and 
therapeutic interviews were compared in re- 
gard to fathers’ attitudes, only five had any 
item in common and none more than one. Of 
these five only three could be interpreted as 
challenging the null hypothesis at a level of 
confidence between 5 and 10 per cent. 

3. When the Sentence Completion Test and 
the TAT were compared in regard to mothers’
attitudes, only two had any item in common,
neither more than one, and neither
could be interpreted as significant. 

4. When the Sentence Completion Test and 
therapy sessions were compared in regard to 
mothers’ attitudes, five had one item in com- 
mon, none more than one, and only one chal- 
lenged the null hypothesis at a level of con- 
fidence between 5 and 10 per cent. 

5. In regard to fathers’ images, compari- 
son of the Sentence Completion Test with the 
TAT and with therapy sessions shows only 
one item in common in each comparison. 
Neither is significant. 

6. In regard to mothers’ images, compari- 
son of the Sentence Completion Test with the 
TAT and with therapy sessions shows two 
items in common in each comparison. None 
of these is significant. 

These results suggest that the kinds of atti- 
tudes and images of parents which are given 
in the Sentence Completion Test furnish no 
basis of prediction of those found in either 
TAT stories or in therapy sessions. 


Received August 19, 1954. 
Some Standardized Scales for Disorganization
in Schizophrenic Thinking¹


William A. Hunt 


Northwestern University 


and Franklyn N. Arnhoff 


University of Nebraska College of Medicine 


In descriptive psychopathology we often 
teach by illustration. Thus, in speaking of 
confusion in schizophrenic thinking, we 
might illustrate it by quoting at random sev- 
eral examples of schizophrenic responses to 
vocabulary items on a standard vocabulary 
test. Such clinical materials are rarely pre- 
sented in the order of the severity of pa- 
thology exhibited, but usually are put before 
the student in random, unorganized fashion. 
It would be more meaningful, pedagogically, 
if they could be presented in some kind of 
orderly scale. The difficulty of presenting 
clinical material in this fashion is that we 
have no “standardized” materials from which 
to draw. Pathological material is rarely or- 
dered or scaled in this fashion, perhaps, be- 
cause of the assumed unreliability of the 
clinical judgments which must be the basis 
for such a scaling approach. 

In a previous study (1) in which 22 ex- 
perienced clinical psychologists rated, on an 
11-point scale for disorganization of think- 
ing, standard vocabulary item responses made 
by schizophrenic patients, we found indica- 
tions that there might be sufficient agree- 
ment among clinicians on such items for scal- 
ing purposes. We, therefore, decided to try 
constructing such scales, limiting ourselves to 
pathology as manifested on verbal test re- 
sponses because of the ready availability of 
the material and the ease with which it can 


1 This study is part of a larger project continuing 
under ONR contract 7 onr—450(11) with Northwest- 
ern University. The opinions expressed, however, are 
those of the individual authors and do not represent 
the opinions or policy of the Naval service. 
be reproduced without photographic or phonographic
recording.


Procedure 

In the previous study (1), 222 schizo- 
phrenic responses to Wechsler-Bellevue and 
Terman-Binet Vocabulary items, judged by 
three trained clinicians as covering all pos- 
sible values of confusion in thinking in such 
responses, were rated by 22 clinical psycholo- 
gists of at least four years’ professional ex- 
perience using an 11-point scale for degree of 
disorganization in thinking. On the basis of 
the means and standard deviations of the 
raters’ judgments for these items, we selected 
50 of them for the present study. These items 
were evenly distributed over the 11-point 
scale used and seemed to represent an ade- 
quate sampling of the original universe of 
items. 

In an attempt to extend the range of our 
materials we also selected 50 responses to 
Comprehension items on the Wechsler-Belle- 
vue Intelligence Scale. These were chosen 
from the same original test protocols from 
which the vocabulary items were drawn, and 
were judged as being representative of the 
complete range of pathology exhibited. Other 
types of test materials such as Information 
and Similarities were considered for use but 
were discarded after inspection of a large 
number of such responses revealed that they 
did not offer a suitable range of pathology for 
the construction of scales with more than a 
minimum of scale points.

For the present study, it was felt more de- 
sirable to have the judges rate on a 7- rather 
than an 11-point scale. Furthermore, the instructions
for the subjects in the present
study were modified, as the previously used 
instructions were felt to be too vague. Indi- 
cations from the data and discussions with 
the subjects of our first study indicated that 
other criteria than mere severity of pathology 
were being used. Prognosis, therapeutic indi- 
cations, etc. were all being used, confounding 
the variable under study and lowering the 
reliability of the judgments. We, therefore, 
wrote very careful, explicit instructions, de- 
fining the task requested of the judges. This 
point is of considerable importance because 
of the reliability of the results of the present 
study, which is much higher than that previ- 
ously discovered in our own work (1, 4) and 
that of others (2, 6). The instructions fol- 
low: 


We are going to present you with a number of 
schizophrenic test responses to items taken from the 
Wechsler-Bellevue and Terman intelligence scales. 
One of the ways in which the pathology of schizo- 
phrenia may express itself is through disordered 
thinking which results in atypical, deviant, or “ab- 
normal” responses to the items on such a test. The 
qualitative interpretation by the clinician of such 
test responses is one of the bases upon which he may 
base a clinical or diagnostic interpretation. All of 
these responses were given by schizophrenics. None 
of the cases were complicated by mental deficiency 
or known organic conditions. The extent of the pa- 
thology exhibited however is not uniform. In some 
of the responses it is minimal and in others it is 
extreme. 

We are going to ask you to rate these responses on 
a 7-point scale according to the severity of the pa- 
thology exhibited in the response, with the low end 
of the scale representing minimal pathology and the 
high end of the scale representing maximal pa- 
thology. In making these ratings we are asking you 
to concentrate upon the severity of the pathology 
exhibited in the response itself. We are not inter- 
ested in such things as prognosis, chronicity, thera- 
peutic indications, etc. In essence, what we are ask- 
ing you to do is to judge how “schizophrenic” each 
response is. 

Our basic assumption is that the nature of such 
responses is one of the dimensions upon which a 
diagnosis of schizophrenia is made. We are assum- 
ing that some kind of a continuum exists and that 
these items could be placed upon it, i.e., the pathology
exhibited in some of these responses may be
such that in and of itself it would not lead you 
to think that the persons responding were schizo- 
phrenic. Others of them might cause you to suspect 
schizophrenia. In still others, the pathology exhibited
may be so extreme as to make you reasonably
sure that the patient is schizophrenic. We realize
that no actual diagnosis would ever be made
on the basis of such responses alone. In this sense 
our situation might be called somewhat artificial. We 
are not asking you to make a diagnosis of schizo- 
phrenia on a patient, however. We are merely ask- 
ing you to rate how schizophrenic each of these re- 
sponses is. 

We are doing this because we wish to see whether 
or not there will be sufficient agreement among cli- 
nicians so that we may establish scales representing 
the typical responses of schizophrenics. The obvious 
analogy would be with a handwriting scale or a 
colorimeter in chemistry. 

Our hope is that even if such scales would not be 
applicable in the diagnostic situation as an aid in 
evaluating the nature of actual test responses or 
making a clinical interpretation of actual test re- 
sults, they nevertheless would be very helpful for 
teaching purposes. Thus, in introducing students to 
the pathology exhibited in schizophrenic thinking, 
instead of introducing random illustrations of pe- 
culiar responses given by schizophrenics, it would be 
of real value to present them with an orderly series 
of test responses, scaled on a 7-point continuum 
based on the empirically ascertained agreement of 
a group of trained clinicians. Our goal is to find 
whether or not such scales are possible and, if so, to 
construct them. Your judgments on this experiment 
will be the basis for our work. We thank you for 
your cooperation and hope that you will agree with 
us that such work is valuable in attaining the long
range goals of our mutual profession. 


The 50 Vocabulary responses and the 50 
Comprehension responses were then presented 
to 16 clinical psychologists working in the 
Chicago area. Each one possessed the Ph.D. 
in clinical psychology and a minimum of four 
years on-the-job clinical experience. The data 
were mimeographed and presented so that Vo- 
cabulary items were rated first and Compre- 
hension items second, with the same sheet of 
instructions serving for both. Due to the relatively
small N, counterbalancing the order of
presentation was not attempted. 

In order to check the reliability of our 
judges and materials, the task was repeated 
three months later. The subjects were initially 
unaware of our intent to request a retest. The 
group repeat reliability was .97 for Vocabu- 
lary and .96 for Comprehension, using a 
Pearson r. The individual test-retest coeffi- 
cients using the Pearson r for the individual 
judges ranged from .65 to .92 for Vocabulary 
and from .68 to .90 for Comprehension. The 
agreement among the clinicians was measured 
by correlating each individual’s judgments on 
each group of 50 items with the average of 
the group of judges, using a Pearson r. These
ranged from .73 to .92 for Vocabulary and 
from .64 to .88 for Comprehension on the 
first presentation, and from .69 to .92 for Vo- 
cabulary and from .66 to .86 for Compre- 
hension on the repeat performance. These 
reliabilities were considered sufficiently high 


Table 1

Vocabulary Scale

Scale
Point  Word         Response                                      Mean    SD
1      Gamble       To take a chance, a risk                      1.00  0.00
       Seclude      To go away and be alone, to seclude oneself   1.50  0.63
       Donkey       A type of four-legged animal                  1.50  0.52
2      Gown         Garment you wear for lounging                 1.75  0.93
       Shrewd       Careful in a sneaky, clever way               2.19  0.75
       Nail         A bit of metal used to pound on               2.37  0.81
3      Plural       Means plus another                            2.94  0.93
       Join         Has to do with organization                   2.62  0.96
       Peculiarity  Action one doesn’t usually engage in          3.00  1.15
4      Milksop      A sympathetic listener, but lacking in
                    understanding                                 4.19  1.17
       Espionage    Crooked, not truthful                         4.12  1.09
       Seclude      To put somewhere in the dark                  3.81  1.11
5      Armory       Combined form of some sort of organization    4.94  0.77
       Juggler      Acts in front of a person, respects himself
                    as a juggler                                  5.00  0.82
       Espionage    A type of sinful devilment                    5.44  0.89
6      Nail         Metal I guess, let’s say a metal which is
                    made scientifically for purpose of good
                    and bad use                                   5.75  0.93
       Armory       Part of army subject to call without banner   5.94  0.85
       Diamond      A piece of glass made from roses              6.44  0.63
7      Cushion      To sleep on a pillow of God’s sheep           6.75  0.45
       Fable        Trade good sheep to hide in the beginning     6.81  0.40
       Guillotine   Part of law subject only to those without
                    call to stay on earth                         6.62  0.62


Table 2

Comprehension Scale

Item              Response
Envelope          Deposit it in the mail box
Taxes             Taxes are necessary to support the government
Land in the city  Because they got more accommodations in the
                  city than in the country
Envelope          Best thing is to bring it to post office
Envelope          Pass it by or mail it
Theater           Turn in an alarm so that everyone wouldn’t get
                  burned up
Marriage          Proof and identification so you wouldn’t get
                  someone else’s wife
Shoes             Probably just tradition, Dutch use wood
Laws              It is reasonable for a group of people to come
                  to some agreement and acceptance of a common
                  good and to aid what has proven to be the best
                  for the many; that is they are made to prevent
                  illegal activities
Shoes             Because leather has undoubtedly proved to be
                  the most durable of all that which has been
                  utilized for the preservation of the feet and
                  to continue the comfort of those, that is the
                  people who have chosen to wear shoes
Marriage          For ownership you might say and to take care of
                  each other according to health
Marriage          Some people get married in church and some
                  people get married outside of church
Forest            I’m not good at telling directions. Just walk
                  uphill and when you get to the top it is easier
                  going down
Marriage          For scientific purposes and for the
                  identification of siblings, siblings of the
                  association of the parents

Mean (values legible in the source, in source order): 1.06, 1.06, 2.00,
2.00, 3.00, 4.50, 3.69, 5.06, 5.69, 6.31, 6.31
SD (values legible in the source, in source order): 0.82, 1.03, 0.96,
0.96, 1.09, 0.71, 1.01










to justify construction of rough scales from 
the obtained data. 
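
The two reliability checks described above (test-retest for each judge, and each judge's agreement with the group average) can be sketched as follows. This is a minimal illustration, not the authors' computation: the ratings are hypothetical, the function names are our own, and we assume the group average includes the judge being correlated, which the text does not specify.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def agreement_with_group(ratings):
    """Correlate each judge's ratings with the item-wise group mean.

    `ratings` is a list of per-judge lists, one rating (1-7) per item.
    Assumption: the group mean includes the judge's own ratings.
    """
    n_judges = len(ratings)
    group_mean = [sum(item) / n_judges for item in zip(*ratings)]
    return [pearson_r(judge, group_mean) for judge in ratings]

# Hypothetical ratings: 3 judges x 5 items (the study used 16 judges x 50 items).
first = [[1, 2, 4, 6, 7], [1, 3, 4, 5, 7], [2, 2, 5, 6, 6]]
retest = [[1, 2, 5, 6, 7], [1, 3, 3, 5, 7], [2, 3, 5, 6, 7]]

judge_vs_group = agreement_with_group(first)                   # one r per judge
test_retest = [pearson_r(a, b) for a, b in zip(first, retest)]  # one r per judge
```

With real data, the range of `judge_vs_group` and of `test_retest` would correspond to the ranges of coefficients reported in the text.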

The items for the scales were selected by 
choosing those responses whose means were 
closest to the desired scale point and whose 
standard deviations were the lowest available. 
Differences between the mean rating for each 
item on the test-retest were tested using Student’s
t for correlated data (3, p. 277), and
for variances using the formula given by Mc- 
Hugh (5) for comparing two correlated sam- 
ple variances. The few items which showed 
significant differences were discarded. Three 
scales for Vocabulary and two for Compre- 
hension were selected, based on the above- 
mentioned criteria for item selection. The rat- 
ings on the first testing are used as it is felt 
they are more representative of the “true” 
value, ruling out the effects of practice or 
memory, negligible though they were. These 
are presented in Table 1 where all three Vo- 
cabulary items for each scale point are listed 
together, as are the two items at each scale 
point for Comprehension in Table 2. 
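
The selection rule just described can be sketched in a few lines. The ordering of the two criteria is an assumption on our part (closeness of the mean to the scale point first, smaller standard deviation as the tie-breaker); the paper states both criteria but not how they were weighed. The five means and SDs are taken from the Vocabulary table; the function name and data layout are hypothetical.

```python
def select_items(items, scale_points=range(1, 8), per_point=3):
    """For each scale point, pick the unused items whose mean rating is
    closest to that point, preferring the smaller SD when distances tie."""
    chosen = {}
    used = set()
    for point in scale_points:
        candidates = [it for it in items if it["id"] not in used]
        candidates.sort(key=lambda it: (abs(it["mean"] - point), it["sd"]))
        chosen[point] = candidates[:per_point]
        used.update(it["id"] for it in chosen[point])
    return chosen

# A few Vocabulary responses with their reported mean ratings and SDs.
items = [
    {"id": "Gamble", "mean": 1.00, "sd": 0.00},
    {"id": "Donkey", "mean": 1.50, "sd": 0.52},
    {"id": "Gown", "mean": 1.75, "sd": 0.93},
    {"id": "Shrewd", "mean": 2.19, "sd": 0.75},
    {"id": "Plural", "mean": 2.94, "sd": 0.93},
]

# One item per scale point, for the first three points only.
picked = select_items(items, scale_points=(1, 2, 3), per_point=1)
```

Because items are assigned greedily from the lowest scale point upward, a different processing order could yield a different assignment; the sketch is meant only to make the two criteria concrete.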


Discussion 


It will be noticed that in the case of Com- 
prehension there is some indication of a short- 
ening of the scale at the upper extreme. 
Apparently, our severe Comprehension items 
were regarded as somewhat less extreme than 
were the Vocabulary items. The reasons for 
this are unknown. Our figures on the reli- 
ability of clinicians’ judgments with this type 
of material are higher than those we reported 
previously (4). This may be attributable to 
sampling, but we feel that the modification 
and clarification of our instructions may be 
a prime factor influencing this change. As one 
of us has said previously, “When dealing with 
experts in a judgmental situation, the task
should be well defined and the criteria set 
forth clearly. Otherwise the riches of knowl- 
edge may yield confusion rather than clarity” 
(1, p. 274). 

The scales here presented are admittedly 
rough. No attention has been paid to such 
theoretical problems as equality of interval, 
etc., and a cross validation of the data is 
necessary. In their present stage, however, we 
feel they are valuable as pedagogical aids. 
Future development of such scales might 
make it possible to introduce quantitative 
scoring for the qualitative cues that clinicians 
presently derive from test materials. This 
type of material might also be used as a 
“test” of clinical ability if the materials were 
given to clinicians, and they were asked to 
rate them. Individual agreement with the 
group consensus then might be a measure of 
clinical aptitude, or at least of an approach 
to accepted current clinical interpretation. 


Received October 19, 1954. 


References 


1. Arnhoff, F. N. Some factors influencing the unre- 
liability of clinical judgments. J. clin. Psychol., 
1954, 10, 272-275. 

2. Asch, P. The reliability of psychiatric diagnoses. 
J. abnorm. soc. Psychol., 1949, 44, 272-276. 

3. Edwards, A. L. Experimental design in psycho- 
logical research. New York: Rinehart, 1950. 

4. Hunt, W. A., Arnhoff, F. N., & Cotton, J. W. 
Reliability, chance, and fantasy in inter-judge 
agreement among clinicians. J. clin. Psychol., 
1954, 10, 294-296. 

5. McHugh, R. B. The comparison of two correlated
sample variances. Amer. J. Psychol.,
1953, 66, 314-315. 

6. Mehlman, G. The reliability of psychiatric diag- 
noses. J. abnorm. soc. Psychol., 1952, 47, 577- 
578. 


Journal of Consulting Psychology 
Vol. 19, No. 3, 1955 


Differential Prediction of a Specific Behavior
from Three Projective Techniques


Claire M. Vernier, 
VA Center, Martinsburg, West Virginia 


J. Frank Whiting, 
VA Hospital, Rutland Heights, Massachusetts 


and Malcolm L. Meltzer 


Catholic University of America 


In recent years, there has been an increas- 
ing interest on the part of psychologists in 
the use of projective techniques for prediction 
of behavior which has important social con- 
sequences. Such study not only has the value 
of contributing crucial data necessary for an 
integrated theory of behavior in general, and 
projective techniques in particular, but also 
enables psychology to fulfill its professional 
function of aiding the culture in maintaining 
and helping its emotionally disturbed mem- 
bers. 

The area of behavior with important social 
consequences which was chosen for the cur- 
rent study was the great number of tubercu- 
lous patients who leave hospitals and sana- 
toriums against medical advice (subsequently 
referred to as AMA behavior). Since the overwhelming
majority of these patients have
active, communicable disease, the social con- 
sequences of their leaving the hospital are 
infinitely more serious than would be the re- 
sults of similar actions on the part of neu- 
rotics or persons suffering from noninfectious 
physical diseases. These latter individuals pre- 
sent difficulties to themselves alone or at the 
most to their immediate family and friends. 
The actively tuberculous patient is a potential
source of infection to the entire community
as well as an obvious danger to himself
in having interrupted or terminated his
treatment.

1 Formerly Clinical Psychologist, VA Center, Martinsburg,
West Virginia.


A review of the literature on tuberculosis reveals 
several important facts about this problem of AMA 
behavior. First, it is a world-wide problem and not 
merely limited to certain hospitals with inadequate 
facilities. Second, the consensus in the literature is 
that such action stems from pressures from three 
sources: (a) the person’s own needs, (b) the influence
of friends and relatives outside the hospital, and
(c) pressures put upon the person by the hospital 
environment. Third, while there has been much 
speculation concerning personality in the tubercu- 
lous, including efforts to define a tuberculous person- 
ality or character type, there has been little in the 
way of controlled experimental investigation about 
any phase of the psychological concomitants of tu- 
berculosis. Only Gurel and Jennings (5), to our 
knowledge, have attempted any integration of theory 
with empirical investigation of this problem. Essen- 
tially the theoretical basis of their study was one of 
a temporal relationship between the operation of re- 
ward and punishment and the acquisition and reten- 
tion of behavior (9). Thus, in terms of tuberculous 
patients, those who exhibit AMA behavior demon- 
strate an inability to inhibit impulsive behavior in 
order to derive long-term satisfaction, while patients 
who remained in the hospital and were discharged 
with maximum hospital benefit (subsequently re- 
ferred to as MHB behavior) were able to do so. 
More general consideration of the implications of 
this theory is beyond the scope of the present paper. 
However, the above hypothesis concerning the proc- 
ess determining patients’ behavior is the general hy- 
pothesis, at the level of behavior theory, on which 
the present study is based. 


The particular projective techniques utilized
in this study were the Rorschach test,
the Thematic Apperception Test, and
House-Tree-Person Test. The particular hypotheses
tested experimentally will be described in the 
following sections. It suffices here to point out 
that these three tests constitute samples of 
projective techniques which vary both in 
terms of instructions and degree of structure 
of overt stimulus material presented to the 
subject. 


Subjects and Method 


Since January of 1953 a group psychologi- 
cal testing program has been carried out for 
all admissions to the TB Service as soon as 
they have been cleared by the ward physicians 
as physically able to take the tests. The ma- 
jor purpose of the testing program is to pro- 
vide significant information to the physicians 
in regard to the patients’ attitudes toward 
their illness, hospitalization, and treatment, 
and to identify emotional problems which 
could interfere with hospital adjustment or 
response to treatment. However, the test pro- 
tocols are also available for research purposes. 
As of January, 1954, approximately 350 pa- 
tients had been tested. Of this group, 63 had 
received MHB discharges and 95 had re- 
ceived AMA discharges. 

From these two samples of discharged pa- 
tients, all cases were chosen who had been 
given any one of the three tests selected for 
analysis. Inasmuch as the particular battery 
of psychological tests used for the admission 
testing program varied according to the pref- 
erences of the staff psychologist assigned to 
the TB Service, the populations for study of 
the three tests overlap but are not identical. 
Description of the specific subjects used for 
each test is included in later sections. 


Analysis of the Tests 
Rorschach 


The hypotheses tested in this study by 
means of the Rorschach were derived from a 
series of deductions based on the writers’ gen- 
eral hypothesis at the level of behavior, their 
assumptions concerning the projective tech- 
nique situation, and the generally accepted 
clinical interpretations of scoring categories 
of the Rorschach responses. 

Since the basic theory being tested was that 
AMA patients are significantly less able to 
inhibit impulsive behavior in order to obtain
long-term satisfaction than are MHB pa- 
tients, only the determinant categories of the 
Rorschach were chosen for analysis. Accord- 
ing to Rorschach literature (6), the deter- 
minants reflect the emotional aspects of per- 
sonality and serve as a principal basis for 
predicting overt behavior. 

The following hypotheses in regard to the 
determinants were formulated: 


a. AMA patients will be higher than MHB patients
in: F-, F-%, CF, C, ΣC, FM.

b. MHB patients will be higher than AMA patients
in: F, F%, FC, M, C', c, k, K.

c. Within the AMA group, CF + C should exceed
FC, FM + m should exceed M, and FM + m should
exceed c + C'.

d. Within the MHB group, FC should exceed CF
+ C, M should exceed FM + m, and c + C' should
exceed FM + m.


Subjects. Rorschachs were selected for 46 
male AMA patients and 46 male MHB pa- 
tients so that the two groups were comparable 
for age, race, marital status, education, de- 
gree of disease, and type of treatment pre- 
scribed. Respective mean ages were 39.65 and 
39.67. All tests for these 92 patients, identi- 
fied only by a number to avoid possible bias, 
were rescored by one of the authors to main- 
tain consistency. 

Data analysis. The first score to be ex- 
amined was the mean number of responses 
for each group (R). Had there occurred a 
significant difference between the number of 
responses for the two groups, raw determinant 
scores could not have been used and the effect 
of R on each of the several determinant cate- 
gories would have had to be controlled sta- 
tistically. However, mean R for the AMA and 
MHB groups was 14.5 and 14.3, respectively, 
and the chances were 99 out of 100 that such 
a difference would arise by chance alone. 

Two of the determinants, C and K, did not 
occur sufficiently often for the purposes of 
testing the empirical hypotheses by means of 
an analysis of differences between the AMA 
and MHB groups. Such results are in line 
with previous Rorschach experience. 

The remainder of the determinants ana- 
lyzed produced sufficient data for the pur- 
poses of testing the hypotheses formulated. 
Since the empirical hypotheses were stated in 
terms of a specific direction of the results, a 
Table 1

Comparison of AMA and MHB Groups on Rorschach Determinants
(N = 92)

                   Mean             SD
Determinant     AMA    MHB      AMA    MHB      CR      p
AMA predicted greater
  F-            1.9    1.8      2.4    2.4     0.2    [?]
  F-%          33.0   25.0     25.0   18.0     1.8    .04*
  CF            1.8    1.4      [?]    [?]     1.0    .16
  ΣC            2.3    3.0      3.0    2.5     [?]    [?]
  FM            1.5    [?]      [?]    [?]     [?]    [?]
  m             [?]    [?]      [?]    [?]     [?]    [?]
MHB predicted greater
  F             5.0    4.9      6.4    5.9     0.1    [?]
  F%           45.0   48.0     23.0   22.0     0.6    .27
  FC            0.6    1.0      1.1    1.5     [?]    [?]
  M             0.9    1.3      [?]    [?]     [?]    [?]
  C'            0.7    1.2      [?]    [?]     [?]    [?]
  c             [?]    [?]      [?]    [?]     [?]    [?]
  k             0.7    0.5      [?]    [?]     [?]    [?]

* Such differences could occur by chance 10 or less times in
100 instances.
[?] Value illegible in the source.


one-tailed t test was used to determine prob-
abilities (4). The results are listed in Table 1 
and Table 2. 
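
The critical ratios (CR) reported for the percentage comparisons in Table 2 appear consistent with a standard test of the difference between two independent proportions, referred to the normal curve with a one-tailed probability. The sketch below assumes the pooled-proportion form of the standard error and equal groups of 46; the paper does not state which formula was used, and the function names are ours.

```python
from math import sqrt
from statistics import NormalDist

def critical_ratio(pct_a, n_a, pct_b, n_b):
    """Critical ratio (z) for the difference between two independent
    percentages, using the pooled proportion for the standard error."""
    p_a, p_b = pct_a / 100, pct_b / 100
    pooled = (p_a * n_a + p_b * n_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

def one_tailed_p(cr):
    """Upper-tail probability of a standard normal deviate."""
    return 1 - NormalDist().cdf(cr)

# The 70% vs. 50% comparison from Table 2, with 46 patients per group.
cr = critical_ratio(70, 46, 50, 46)
p = one_tailed_p(cr)
```

For this row the sketch gives a CR of about 1.96 and a one-tailed probability of about .025, close to the tabled 2.0 and .03; small discrepancies would be expected because the tabled percentages are rounded.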

In making a series of tests of empirical hy- 
potheses one can always expect that a cer- 
tain number of significant differences can 
arise due to chance relationships in sampling. 
Therefore, a chi-square test was made be- 
tween the observed number of significant re- 
sults and near significant results (p < .10) 
and the expected number of such results. 
Seventeen tests were performed, and thus 1.7
statistically reliable differences
would be expected by chance alone. Eight
significant results were actually obtained. The 
chi-square test of the difference between the 
observed and expected frequencies yields a 
value of 51.2. With 1 degree of freedom the 
chi-square value must be greater than 6.635 
in order to be beyond the .01 level of prob- 
ability. Therefore, it is concluded that the ob- 
served number of statistically reliable differ- 
ences could not have developed from chance 
alone. 
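
Two of the figures in this argument can be checked directly: the number of results expected by chance at the .10 level, and the critical chi-square value for one degree of freedom at the .01 level. The sketch below covers only these two quantities; it does not attempt to reproduce the reported chi-square of 51.2, since the formula behind it is not given.

```python
from statistics import NormalDist

n_tests = 17   # number of tests performed (from the text)
alpha = 0.10   # level used for "significant and near significant" results
expected_by_chance = n_tests * alpha

# With 1 degree of freedom, chi-square is the square of a standard normal
# deviate, so the .01 critical value is the squared two-tailed z for .01.
critical_chi2_01 = NormalDist().inv_cdf(1 - 0.01 / 2) ** 2
```

Here `expected_by_chance` is 1.7 and `critical_chi2_01` is approximately 6.635, matching the values cited.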

Conclusions. One-half of the predicted Rorschach
scores and ratios, assumed to be indicative
of ability to control behavior, were
found to differentiate between the two groups
at better than chance level. These findings 
would support the general hypotheses postu- 
lated; however, none of the obtained differ- 
ences for the individual Rorschach determi- 
nant categories are significant at a level which 
would yield accurate prediction for the indi- 
vidual case. 


Thematic Apperception Test 


The second approach to the problem of the 
prediction of AMA behavior of tuberculous 
patients was the utilization of a projective 
technique designed primarily to elicit under- 
lying personality dynamics. It was hypothe- 
sized that content analysis of the Thematic 
Apperception Test would reveal certain needs 
operating which were common to the AMA 
group but absent in MHB patients. Should 
this be true, an explanation of AMA behavior 
could be made, which in turn would allow for 
prediction and for institution of preventive 
measures. 

Consideration of the nature of the disease 
and of treatment methods suggests that two 
important problem areas confronting the TB 
patient are those of aggression and depend- 
ence. The patients who are able to remain 
hospitalized for lengthy periods of time and 
undergo painful treatment may display strong 
needs for punishment and for submission and 
dependence. The AMA patients, on the other 
hand, seem to show an inability to accept 
either the loss of autonomy or the enforced 
dependence necessary for treatment, and by 
their antisocial behavior seem to reflect needs 


Table 2

Comparison of AMA and MHB Groups on Rorschach Ratios
(Predicted AMA group should exceed MHB group on each.)

                  Percentage of subjects
Ratio                AMA    MHB      CR      p
FM + m > c + C'       70     57     1.3    .10*
FM + m > M            67     52     1.5    .07*
ΣC > M                70     50     2.0    .03*
CF + C > FC           57     39     1.8    .04*

* Such differences could occur by chance 10 or less times in
100 instances.

to direct aggression outwardly. It was therefore
hypothesized that the AMA group would
score high on needs for externalized ag- 
gression, dominance, and independence. The 
MHB group was expected to display the 
converse. 

One further analysis of the TAT records 
appeared appropriate for prediction of AMA 
behavior. It was hypothesized that patients 
who leave AMA also are more prone to act 
on their needs than are those patients who re- 
main for the full course of treatment. Action 
was scored from the TAT stories in terms of 
whether the need expressed took the form of 
direct action or whether the need was ex- 
pressed only as fantasy, contemplation, or as 
a possibility in the future. The first instance, 
where the need found expression in action, 
was characterized as the motor level; when 
the particular behavior was only contemplated 
or fantasied it was designated as the premotor 
level. 

Subjects. Twenty white male veterans who 
had been given TAT’s as part of the admis- 
sion group testing program were selected. 
From their history 10 were identified as AMA 
and 10 as MHB. None of the AMA patients 
had been able to remain in the hospital for 
more than 4 months, the mean length of hos- 
pitalization being 2.73 months. All of the 
MHB patients, on the other hand, had been 
in the hospital for at least a year, the mean 
length of hospitalization as of June, 1954, 
being 16.4 months. The mean age of the 
AMA group was 35.1 yrs.; that of the MHB’s, 
33.4 yrs. The two groups were also equated 
as to economic level, pension status, marital 
status, stage of disease, and type of treat- 
ment. 

Data analysis. Four cards of the TAT had 
been regularly given to each of the twenty 
subjects. These were numbers 1, 4, 6BM, and 
7BM. The TAT records were scored by the 
need-press system devised by Aron (2). A 
ratio was computed for each patient by total- 
ing the number of needs for external aggres- 
sion, dominance, and independence (EDI) 
and relating this to the total number of needs 
for inward aggression, submission, and de- 
pendence (ISD). The data were placed in a 
fourfold table and analyzed by the chi-square 
technique. 
Table 3

Comparison of AMA and MHB Groups on the TAT
(N = 20)

Condition                                         χ²      p
EDI vs. ISD with AMA and MHB combined            0.80    .30
AMA vs. MHB on EDI and ISD                       0.83    .30
Motor vs. Premotor with AMA and MHB combined     0.20    .50
AMA vs. MHB on Motor and Premotor Responses      0.21    .50





Each need was scored also on the motor- 
premotor dichotomy. These data were placed 
in a fourfold table and subjected to a chi- 
square analysis. 
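
For readers unfamiliar with the fourfold-table analysis, a minimal sketch of the chi-square computation follows. The cell counts are hypothetical, since the paper does not report the frequencies, and the sketch uses the uncorrected Pearson formula; whether the authors applied a continuity correction is not stated.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
        [[a, b],
         [c, d]]
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Hypothetical counts: rows are AMA / MHB, columns are EDI / ISD,
# with 10 subjects per group as in the TAT sample.
chi2 = chi_square_2x2(6, 4, 4, 6)
```

With these illustrative counts the statistic is 0.8, a value of the same order as those in Table 3 and well short of significance for one degree of freedom.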

It was determined first that the occurrence 
of EDI responses does not differ significantly 
from the number of ISD responses with the 
samples combined. Should one of the variables 
predominate in all the TB subjects, it would 
be more difficult to discover a significant dif- 
ference between the subgroups. However, as 
can be seen in the first result of Table 3, 
neither EDI nor ISD responses are signifi- 
cantly more plentiful when the two groups 
are combined. Chi square is 0.80 and this fails 
to reach even the .30 level of significance. 

The second result in Table 3 relates more 
directly to the hypothesis that AMA patients 
display strong needs for external aggression, 
dominance, and independence, while MHB 
patients have greater needs for internal ag- 
gression, submission, and dependence. After 
characterizing each subject as EDI or ISD 
and comparing the number of such individu- 
als in each group, we find that chi square is 
only 0.83. This does not reach the .30 level 
of significance and therefore fails to confirm 
our major hypothesis concerning the dynamics 
operating in tuberculous patients who leave 
the hospital against medical advice. 

Our last analysis concerned the level of re- 
sponse, i.e., whether the need was expressed 
as definitely carried out in motor behavior, 
or whether it was presented in the form of a 
wish, fantasy, or intention and therefore was 
premotor. The third result in Table 3 shows 
that the number of needs scored as at the 
motor level did not differ significantly from 
the number of needs scored at the premotor 
level with the AMA and MHB samples combined.
Chi square is .20, which is not significant
at the .50 level of probability. Once
again, if significant differences do exist be- 
tween the groups, they should not be ob- 
scured. 

However, the AMA and MHB groups are 
not differentiated by the level of response as 
can be seen in the final result of Table 3. 
Chi square is only .21, and, therefore, there
are 50 chances out of a hundred that the dif- 
ference is a chance one alone. From this sam- 
ple then, a preponderance of responses at the 
motor level does not of itself indicate a pro- 
pensity for AMA behavior, since responses at 
this level are fairly evenly divided between 
both groups. 

Conclusions. The failure of the TAT to ef- 
fectively discriminate between the AMA and 
MHB patients can be viewed from several 
aspects. Specifically, it is quite possible that 
our particular hypotheses concerning the dy- 
namics of tuberculous patients are in error. 
Furthermore, it can be questioned whether 
any definite constellations of needs or dynam- 
ics are operating in tuberculous patients in 
general or AMA patients in particular. It has 
long been an axiom of the psychodynamic ap- 
proach that the same bit of behavior often re- 
flects different needs in different individuals 
or even distinct needs in the same person at 
separate times. Should this be so in relation 
to the AMA problem, the use of tests re- 
vealing dynamic needs and content of wishes 
and strivings becomes suspect as an ade- 
quate tool for the prediction of this particular 
bit of behavior. 

One exception to the last statement might 
be pointed out. As yet little is known about 
content-type tests which present the subject 
with a stimulus which symbolizes the par- 
ticular situation in which the prediction of 
overt behavior is to be made. What is being 
considered by the authors is a modification of 
the TAT which would depict the particular 
problem areas and situations facing the TB 
patient. Perhaps such a focusing could elicit 
needs and unconscious attitudes more ger- 
mane to the overt behavior to be predicted 
and some communality could be discovered. 
This particular problem awaits further in- 
vestigation. 


House-Tree-Person Test 


Inclusion of a drawing test in the battery 
of techniques selected for analysis was based 
on the hypothesis that nonverbal tests tend 
to yield data less subject to consciously con- 
trolled evasion and censorship and thus pro- 
vide a more efficient basis for prediction of 
impulsive, overt behavior (11). The HTP 
test was chosen since the house and the per- 
son constitute stimuli approximating two of 
the focal elements in the problem behavior 
being studied, namely institutional versus 
home living, and the patient’s perception of 
his own goals and needs. 

Contrary to the usual assumptions concern- 
ing the house drawing as reflecting attitudes 
toward home and family, the authors assumed 
that the house drawn by the patient would 
symbolize aspects of his concept of and atti- 
tudes toward his important living space (3). 
The following specific hypothesis was made: 
AMA and MHB patients will differ signifi- 
cantly in the treatment of all parts of the 
House drawing which pertain to an interac- 
tion of outside and inside. The following 13 
test characteristics were thus selected for dif- 
ferential analysis: 


1. door placement
2. window details
3. details on door
4. number of windows
5. chimney treatment
6. inclusion of walk
7. inclusion of steps
8. inclusion of fence
9. inclusion of road
10. inclusion of garage
11. inclusion of shrubbery and trees
12. direction of house facing (e.g. differential placement
of small side)
13. ground line drawn

Three aspects of the person drawing were chosen
for analysis: direction the person was facing, type 
of neck treatment, and presence or absence of hands.
Underlying the use of these three variables were the 
current theories of drawing interpretation: (a) fac- 
ing to the right hand side of the paper represents 
externalization of action; (b) the neck depicts the 
type of controls over body impulses; (c) treatment 
of hands depicts aspects of interpersonal relationships
(8, 11).


Subjects. Thirty male MHB patients and 
50 male AMA patients were selected. Mean 
ages for the two groups were 36.2 and 34.7, 
respectively, with ranges of 18-65 and 21-63.
Approximately 85% of each group were white
(83% and 86%, respectively); the remainder 
were Negro. Total length of hospitalization 
for the MHB group ranged from 2 to 47 
months, with a mean of 11.0 months; for the 
AMA group the mean was 3.8 months, with 
a range of 1 to 9 months. The two groups 
were equated for education, occupational 
status, marital status, degree of disease, and 
type of treatment. 

Data analysis. Incidence of occurrence was 
tabulated for each of the 16 test variables 
selected and the differences in percentages 
computed. Results are given in Table 4. 

Of the 16 variables tabulated, two did not 
occur sufficiently frequently in either group 
to permit further analysis. Of the remaining 
14, seven yielded differences between the two 
that could be expected to occur by chance 


Table 4

Comparison of AMA and MHB Groups on the HTP
(N = 80)

                                          Percentage of subjects
Item                                         AMA    MHB      CR      p
House
  Door on left side of page only               6     27     2.3    .02*
  Door details, 2 or more present             24     30     0.6    .55
  Window detail drawn as +                    62     23     3.9    .0001*
  No. of windows, 4 or more                   32     53     1.8    .08
  Smoke from chimney                          26      0     4.3    .0001*
  Walk present                                20     10     1.2    .21
  Steps drawn                                 18     27     1.0    .32
  Road indicated                              12      0     2.4    .02*
  Shrubbery and/or trees drawn                 2     10     2.0    .06
  Small side of house on right side
    of page                                    4     20     2.4    .02*
  Ground line drawn                           22      3     2.7    .01*
Person
  Head facing to right of page                20      7     1.6    .11
  Neck either omitted or drawn wider
    than long                                 50     37     1.2    .23
  Hands omitted                               44     23     2.1    .02*

* Such differences could be expected to occur by chance less
often than three times in 100.

less often than five times in 100. The expected
frequency of results below .05 by chance 
alone would be 0.8. A chi-square test yields 
a value of 40.61, which is well beyond the 
.01 level of 6.63 for one degree of freedom.
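
The reported chi-square of 40.61 can be reproduced from the figures in the text (seven observed results against 0.8 expected) if one assumes a single-cell chi-square with Yates' continuity correction. This is an inference from the arithmetic, not a formula stated in the paper, and the function name is ours.

```python
def chi_square_one_cell(observed, expected, yates=True):
    """Chi-square contribution of a single observed/expected pair,
    optionally with Yates' continuity correction (subtract 0.5 from
    the absolute difference before squaring)."""
    diff = abs(observed - expected)
    if yates:
        diff -= 0.5
    return diff ** 2 / expected

# Seven significant results observed; 0.8 expected by chance (from the text).
chi2 = chi_square_one_cell(observed=7, expected=0.8)
```

Under these assumptions `chi2` is 40.6125, which rounds to the reported 40.61.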

Conclusions. The general hypothesis was 
confirmed that a nonverbal test which pre- 
sented stimuli to the subjects, certain ele- 
ments of which were directly related to the 
problem area of behavior to be predicted, 
would yield differences between the two 
groups sufficiently reliable to permit predic- 
tion for the individual case. 


Application of Results to a Second 
Sample of Patients 


Patients admitted for the first time to the 
hospital and seen for psychological testing 
during the period of January through March, 
1954, constitute a group available for a par- 
tial approach to cross validation of the find- 
ings. Test reports written by the author 
assigned to the TB Service at that time in- 
cluded a specific prediction of whether the pa- 
tient was a potential AMA risk. Since the 
report also included a description of the fac- 
tors involved for the individual and the im- 
plications for treatment, it is impossible to 
evaluate the influence of the report, the pre- 
diction, and the suggestions upon the subse- 
quent actions of the patient. The psychologist 
responsible for the predictions had knowl- 
edge of the findings from the statistical analy- 
sis of the three tests but did not use any type 
of index or quantified approach. The analysis 
of the accuracy of the predictions is shown 
in Table 5. 

Since all the patients had been in the hos- 
pital less than six months, these data are not 
final figures. It is interesting to note that to 
date all of the errors in prediction have been 
in one direction, namely, patients identified 
as potential AMA risks who still remain in 
the hospital. 
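
The combined accuracy figure given with Table 5 follows from the table's counts if a prediction is counted as correct when a patient predicted to be an AMA risk has left AMA, or a patient predicted not to be a risk has not. That scoring rule is our reading of "combined accuracy"; the variable names are hypothetical.

```python
# Counts from Table 5.
predicted_ama, predicted_ama_left = 16, 12    # predicted AMA risk; of these, left AMA
predicted_stay, predicted_stay_left = 10, 0   # predicted no risk; of these, left AMA

# Correct predictions: risks who left plus non-risks who stayed.
correct = predicted_ama_left + (predicted_stay - predicted_stay_left)
accuracy = correct / (predicted_ama + predicted_stay)
```

This gives 22 correct predictions out of 26, or about 85 per cent, with all four errors falling among patients predicted to leave who were still hospitalized.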


General Discussion of Results 


Implications of findings for the problem of 
AMA behavior prediction. Despite the small 
samples available for analysis, a sufficient 
number of reliable differences in test scores 
were found to permit accurate prediction for 
individual patients. The data for one of the 
Table 5

Application of HTP Findings to First Admission
Patients Over a Three-Month Period
(N = 26)

                      Number of    Number of patients
Prediction            patients     discharged AMA*
An AMA potential         16             12†
No AMA potential         10              0†

* As of August 15, 1954.
† Combined accuracy of prediction equals 85 per cent.


tests, the drawing of a house, justified the de- 
velopment of a specific index. Use of this pre- 
dictive index with two other groups of TB 
patients will be reported subsequently. 

While the findings support the general hy- 
pothesis that the patients who leave the hos- 
pital AMA show less control over their be- 
havior than do patients who stay, there was 
no confirmation of specific differential dy- 
namics or personality characteristics between 
the two groups. The one concrete conclusion 
which may be drawn is that, regardless of the 
problems and motivations of the individual 
patient, those cases who leave against advice 
project their perception of ease of access and 
exit into their representation of a dwelling. 

Since the study demonstrates that potential 
AMA patients may be identified accurately 
shortly after their first admission to the hos- 
pital, the next problem is that of preventive 
treatment. The major implication of the pres- 
ent findings is that one important factor is 
the perception of the hospital as an empty 
and unsatisfying place which has no bar- 
riers to prevent departure. Thus, one area of 
needed attitude change to prevent AMA dis- 
charge would appear to be the patients’ con- 
ceptions of the hospital itself. 

Implications of results for projective test 
theory. For several decades psychological lit- 
erature has presented discussion of various 
aspects of the problem of prediction of be- 
havior from projective techniques. Much of 
the attention has been focused on difficulties 
in validation of the tests or verification of 
isolated hypotheses about selected test scores 
or characteristics. Within recent years increasing
emphasis has been placed on the use
of projective measures in the practical area
of evaluation of changes occurring during the
psychotherapy process and the prediction of
response to psychotherapy (1, 10).

However, the absence of an integrated, 
comprehensive theoretical basis continues to 
handicap much of the clinical research with 
projective methods. Although there has been 
an increasing tendency to rely on field theory 
as an adequate frame of reference for the or- 
dering of projective behavioral data, rela- 
tively minimal use has been made of this con- 
ceptual framework in experimental studies. 
The authors have derived their basic hypothe- 
sis of the use of interaction measures for be- 
havior prediction from the basic postulate of 
field theory that behavior is a function of the 
person-situation relationship (7). Therefore, 
to predict reliably a specific behavior, the 
psychological instrument used must measure 
the interaction of the person’s needs and the 
situational stimuli. The more fully a given
projective test provides such measures, the
more accurate the prediction of the externalized
behavior.

In selecting the three diverse types of pro- 
jective techniques for analysis in this study, 
the authors had assumed that a differential 
level of accuracy in prediction would result. 
While all three of the tests provide measures 
of individual dynamics, the tests may be ar- 
ranged on a continuum of the amount of 
situational stimuli presented. In terms of the 
very literal description of the behavior to be 
predicted, namely, a person walking out of a 
living place, the tests could be ordered as fol- 
lows: drawing of a tree, TAT, Rorschach, 
drawing of a person, and drawing of a house. 

The findings of the present study support 
this hypothesis. Neither the tree drawings nor 
the TAT yielded any reliable differences between
the two groups of patients. The Rorschach
and person drawings gave reliable
group differences but could not be used for 
clinical prediction. The house drawings were 
successful in prediction for the individual 
patient. 

The major implication of the present study 
for projective methods would appear to be a 
confirmation of the importance of analysis of 
interaction between person and situation for 
accurate behavioral prediction. For such an
analysis to be possible, it is essential that the
tests, while ambiguous, present stimuli which
tap the specific situation or area in which the
behavior to be predicted occurs.


Summary 


Three projective tests were analyzed in 
terms of reliability of differentiation between
two groups of hospitalized TB patients. The 
two groups of patients were comparable for 
age, race, education, marital status, pension 
status, degree of disease, and type of treat- 
ment; they differed only in length of stay in 
the hospital and the type of medical discharge 
received. 

The TAT yielded no significant differences 
between the two groups and none of the spe- 
cific hypotheses were verified. 

Eight of the determinant scores and ratios 
on the Rorschach test differentiated reliably 
between the two groups, but not at a level 
which would permit clinical prediction. While 
the patients differ significantly in degree of 
control over personal impulses, this factor 
does not in itself provide a sufficient explana- 
tion for the behavior under study. 

Drawings of the house provided seven vari- 
ables which differentiated the two groups; 
two of these differences were at a level which 
would permit prediction for an individual pa- 
tient. Subsequent application of these find- 
ings to a second sample of TB patients re- 
sulted in accurate prediction for 85 per cent 
of the cases. 

The differential accuracy in prediction for
the three projective tests studied would support
the authors’ basic hypothesis that accurate
prediction of specific overt behavior
from projective techniques is dependent upon
the extent to which the test provides a measure
of the interaction between needs of the
individual and a symbolization of the external
factors of the situation in which the
behavior occurs.


Received September 22, 1954.


Numerous studies have attempted to apply
statistical techniques, such as multiple regres- 
sion and the discriminant function, to the 
problem of the psychiatric validity of the 
Rorschach scoring categories. Typically these 
studies attempt to isolate a subgroup of cate- 
gories or elements from the total profile and 
to estimate a series of weights to be applied 
to these elements which, in linear combina- 
tion, will best discriminate between psychi- 
atric classificatory syndromes. This purely 
psychometric approach, which minimizes the 
role of the clinical interpreter of the Ror- 
schach protocol, can be successful when one 
necessary condition is met, i.e., when Ss alike 
in a selected group of clinical behaviors (psy- 
chiatric syndrome) are alike on a selected 
group of Rorschach profile elements and when 
these Ss differ on the same elements from Ss 
in some other psychiatric category. Failure 
to find diagnostic validity of the Rorschach 
using this approach may be attributable to 
several possible factors. The psychiatric clas- 
sification method may fail to provide inter- 
nally homogeneous and externally heteroge- 
neous categories of behavior, and result in 
considerable behavioral overlap between the 
psychiatrically distinct categories. Or several 
distinct clusters of Rorschach elements may 
occur within a given diagnostic category, but 
these clusters are still different from those 
within another psychiatric category. Or the 
element clustering may cut across diagnostic 
lines with Ss from different categories being 
more alike Rorschachwise than are Ss within 
the same categories. These last two possibilities
are suggested by the study of Wittenborn
and Weiss (5) who reported an inverted fac- 
tor analysis of the intercorrelations on 55 
symptom-rating scales of 20 Ss with the same 
psychiatric diagnosis (manic-depressive psy- 
chosis—manic state). The absence of a gen- 
eral factor from the six extracted leads these 
authors to conclude that important subgroup 
differences exist within such a homogeneous 
psychiatric group and that these differences 
are obscured by the similar diagnostic label. 
Finally, the Rorschach elements may be mutually
independent, so that no clustering occurs.

One promising preliminary approach to the 
isolation of clusters of Rorschach elements is 
the inverted factor analytic or Q technique. 
Factors are extracted from a matrix of inter- 
correlations between Ss (rather than from a 
matrix of intercorrelations between variables 
as in the standard R technique) and, after 
the factors are rotated to simple structure, 
the factor loadings used to group the Ss. Cat- 
tell (2, pp. 502-503) regards this method as 
especially useful in isolating subgroups within 
a nonhomogeneous population and says the 
simple structure factors should correspond to 
these population variables when the depend- 
ent variable measures are affected by the 
population variables (2, p. 511). This sug- 
gests that if we selected groups of Ss from 
distinctly different psychiatric classifications, 
selecting Ss for internal behavioral homoge- 
neity and intergroup heterogeneity, and fac- 
tor analyzed the correlations between Ss over 
a large series of Rorschach elements, the factor
loadings would permit a psychometric
grouping of the Ss. If this grouping corresponded
to the psychiatric grouping we would
have evidence that a psychometric technique
could profitably be applied to a larger sample
of Ss.
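The distinction between the R and Q techniques comes down to which axis of the data matrix is correlated. The following sketch uses NumPy on made-up data; the random scores and the variable names are assumptions for illustration, and only the 16-by-42 shape mirrors the present design.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 16 subjects (rows) scored on 42 Rorschach elements (columns).
scores = rng.normal(size=(16, 42))

# R technique: correlate the variables with one another across subjects.
r_matrix = np.corrcoef(scores, rowvar=False)   # 42 x 42

# Q technique: correlate the subjects with one another across variables.
q_matrix = np.corrcoef(scores)                 # 16 x 16

print(r_matrix.shape, q_matrix.shape)
```

Factoring `q_matrix` rather than `r_matrix` yields loadings for persons, which is what allows the loadings to be used to group the Ss.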


Procedure 


Subjects. Sixteen Ss were used with four Ss 
in each of four diagnostic categories. Group 1 
Ss had all been diagnosed as paranoid schizo- 
phrenics with a minimum of disagreement in 
regard to diagnosis. In each case delusions had 
been clearly elicited and there was a history 
of markedly unusual social behavior. Clinical 
material gave evidence of unusual difficulty 
in sexual adjustment and confusion in sexual 
role or homoerotic tendencies. Group 2 Ss had 
all been diagnosed as imbeciles. There was no 
evidence in any case of brain injury or of 
special physiological determinants. Group 3 
had a diagnosis of anxiety state. Clinical ma- 
terial gave evidence of marked current anx- 
iety at the time of testing. In no case was 
there any evidence of schizophrenic person- 
ality features, bizarre thinking, or loss of af- 
fect. History material demonstrated strong 
discomfort and incapacitation from anxiety 
feelings. All four Ss were psychiatric patients. 
Group 4 Ss all had a diagnosis of involutional 
melancholia. Clinical notes (dated shortly be- 
fore and after testing) indicated that each S 
was clearly depressed at the time. Feelings of
inadequacy, futility, and guilt were verbalized. 
The gross change in affect was the primary 
feature of the illness. 

Several hundred case records were reviewed 
in selecting Ss. The purpose of the selection 
was to insure that Ss in any one of the four 
groups not only had a similar diagnosis, but 
were also as homogeneous as possible in terms 
of the clinical picture and current status. Con- 
trols reflected this purpose. For example, the 
four Ss in Group 1 were all male; the four in 
Group 4 all female. 

Rorschach. Each protocol was scored three
times, once by the original examiner and in- 
dependently by two other scorers for pur- 
poses of this study. Totals for variables used 
in the present study were compared for the 
two independent scorings and inspection indi- 
cated no disagreements that would affect the 
later dichotomization of the categories. The
selection of Ss and scoring was under the direction
of the second author who turned the
data over to the first author for the statistical 
analysis with all identifying information (such 
as diagnosis, sex, etc.) removed. The first au- 
thor was not aware of the specific diagnostic 
groups involved nor the groupings of the Ss 
until after the factor analysis and subsequent 
rotation of factor axes had been accomplished. 


Results 


The distribution of the 16 patients on each 
of the 42 Rorschach elements was found and 
each element dichotomized as close to a me- 
dian value as was possible. Patients scoring 
above the median on an element were scored 
“plus”; those falling below the median score
for that element were scored “minus.” Thus 
for each patient we had a series of 42 plus 
and minus ratings. Each patient was corre- 
lated with every other patient by comparing 
their plus and minus ratings on each of the 
42 elements and fourfold tables constructed 
reflecting the correspondence between their 
ratings. Phi correlation coefficients were com- 
puted from each of the 120 fourfold tables 
since the assumption of normality necessary 
for the use of tetrachoric correlations was ob- 
viously not possible for the distributions of 
many of the Rorschach elements.¹
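The dichotomization and phi computation described above can be sketched as follows. This is a minimal illustration, not the authors' worksheet procedure: the function names are invented here, ties at the median are scored "minus" as a simplifying assumption, and the example profiles in the usage note are hypothetical.

```python
from itertools import combinations
from math import comb, sqrt
from statistics import median


def dichotomize(element_scores):
    """Score each subject plus (True) if above the median on one element."""
    cut = median(element_scores)
    return [score > cut for score in element_scores]


def phi(x, y):
    """Phi coefficient from the fourfold table of two plus/minus profiles."""
    a = sum(1 for p, q in zip(x, y) if p and q)          # plus / plus
    b = sum(1 for p, q in zip(x, y) if p and not q)      # plus / minus
    c = sum(1 for p, q in zip(x, y) if not p and q)      # minus / plus
    d = sum(1 for p, q in zip(x, y) if not p and not q)  # minus / minus
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0


def phi_matrix(elements):
    """elements: one list of subject scores per Rorschach element."""
    by_element = [dichotomize(scores) for scores in elements]
    profiles = list(zip(*by_element))  # one plus/minus profile per subject
    pairs = combinations(range(len(profiles)), 2)
    return {(i, j): phi(profiles[i], profiles[j]) for i, j in pairs}


n_pairs = comb(16, 2)  # number of fourfold tables for 16 subjects
```

For two hypothetical four-element profiles, `phi([True, True, False, False], [True, False, False, False])` gives about .58, and `n_pairs` reproduces the 120 fourfold tables mentioned in the text.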

The matrix of intercorrelations between the
16 Ss was factor analyzed by the standard
centroid method, using the largest correlation
in each column as the communality estimate
at each stage of the extraction process. The
process was terminated after the third factor
had been estimated because of the small remaining
residual correlations (the median of
the absolute magnitudes of the residuals was
.07). The original three factors were graphically
and analytically rotated to achieve, as
far as was possible, orthogonal simple structure.
Both the original and rotated factors
tended to be bipolar, eliminating the possibility
of positive manifold, and simple structure
could not be completely achieved.
Cattell (1, pp. 321-327) has noted that
the statistical characteristics of the intercorrelated
variables, such as test difficulty and
variance, can affect the magnitudes of the
intercorrelations and factor structure. Variables
of either extremely high or low difficulty level
show small variance, or are high in “eccentricity.”
This is particularly important when
point correlations, such as the phi coefficient,
are used. At the empirical level, Wittenborn
(4) has shown that many Rorschach element
scoring categories are highly related to the
total number of responses given by the S. In
the present study this background suggests
that the factor loadings of a given S may be
a function of the proportion of his elements
scored “plus” which in turn is a function of
his total number of Rorschach responses. To
control this possibility two additional measures
related to the Rorschach “productivity”
of each S were computed: the percentage of
Rorschach elements scored “plus” for the S
(P), and the “eccentricity” or variance of this
percentage (PQ where Q = 100 - P).


¹ The list of Rorschach elements, ranges of responses
scored plus, and numbers of patients falling
above the dichotomization points, and the phi
intercorrelations between the patients and residual
correlations after extraction of three factors have been
deposited with the American Documentation Institute.
Order Document No. 4503 from the ADI Auxiliary
Publications Project, Photoduplication Service,
Library of Congress, Washington 25, D. C., remitting
in advance $1.25 for microfilm or $1.25 for
photocopies. Make checks payable to Chief,
Photoduplication Service, Library of Congress.


Table 1

Factor Loadings (Decimal Points Omitted) of Three Factors Extracted from Phi
Intercorrelations of Rorschach Profiles of Psychiatric Subjects

                                Productivity             Factor loadings
                                 variables        Original            Rotated
Psychiatric group       Patient      P    PQ      I    II   III      A     B     C    h²
Neurotic (anxiety)           20     67    22     43   -32    26     54    14    23    35
                             40     64    23     52   -54    15     66   -02    39    58
                             40     50    25    -37   -28   -34    -12    56   -06    33
                             55     40    24    -38    19   -30    -45   -23   -15    27
Depressed                     4     45    25     45   -24   -14     24   -01    47    28
                             23     67    22    -13   -45    17     39   -27   -15    25
                    [illegible]     50    25    -39    14    40     00    12   -57    33
                             54     62    24     41   -10   -22     08    02    47    23
Paranoid schizophrenic        3     79    17     12   -09    27     27    15   -06    10
                             22     17    14    -18    64    24    -39    47   -36    50
                             24     62    24    -35   -34   -07     09   -44   -21    24
                             29     36    23     39    16   -28    -15    15    46    26
Imbecile                     14     33    22     77    10   -08     14    41    65    61
                             37     24    18     20    36    36     01    54   -10    30
                             53     14    12     24    51    28    -13    61   -03    40
                             63     29    21    -27   -18    47     32    00   -47    33
Per cent of total variance                       15    11     8     10    11    13    34
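For readers unfamiliar with the centroid method, the extraction step described above can be sketched in code. This is a simplified illustration under stated assumptions, not a reconstruction of the authors' hand computation: the sign-reflection rule (flip the subject with the most negative column sum until none is negative) is one common heuristic, the communality estimate is taken as the largest absolute off-diagonal entry in each column, and all function names are invented here.

```python
import numpy as np


def centroid_factor(corr):
    """Extract one centroid factor; return (loadings, residual matrix)."""
    r = np.array(corr, dtype=float)
    off_diag = r - np.diag(np.diag(r))
    # Communality estimate: largest absolute correlation in each column.
    np.fill_diagonal(r, np.abs(off_diag).max(axis=0))

    # Reflect subjects (flip signs) until no column sum is negative.
    signs = np.ones(len(r))
    for _ in range(len(r) ** 2):
        col_sums = (r * np.outer(signs, signs)).sum(axis=0)
        worst = int(col_sums.argmin())
        if col_sums[worst] >= -1e-12:
            break
        signs[worst] *= -1

    reflected = r * np.outer(signs, signs)
    total = reflected.sum()
    if total <= 1e-12:  # nothing left to extract
        return np.zeros(len(r)), r
    loadings = signs * reflected.sum(axis=0) / np.sqrt(total)
    return loadings, r - np.outer(loadings, loadings)


def extract(corr, n_factors=3):
    """Repeat the extraction on successive residual matrices."""
    residual, columns = np.array(corr, dtype=float), []
    for _ in range(n_factors):
        loadings, residual = centroid_factor(residual)
        columns.append(loadings)
    return np.column_stack(columns), residual
```

A stopping rule such as the median absolute residual (.07 in the present study) could be computed from the off-diagonal entries of the returned residual matrix.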

In Table 1 are given the original and rotated
factor loadings for each of the 16 Ss
and also the communalities (h²) and productivity
variables (P and PQ) for each S.²
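The derived quantities in Table 1 follow from simple formulas, sketched below. Two points are assumptions rather than statements from the text: PQ is taken as P(100 - P)/100, a scaling inferred from the tabled values (for example P = 67 with PQ = 22), and the count of 28 "plus" ratings is a hypothetical input chosen to give P of about 67. The loadings 43, -32, 26 are those printed for the first S in Table 1.

```python
def productivity(n_plus, n_elements=42):
    """Return (P, PQ) for a subject with n_plus elements scored plus."""
    p = 100.0 * n_plus / n_elements
    pq = p * (100.0 - p) / 100.0  # division by 100 is inferred from Table 1, not stated
    return p, pq


def communality(loadings):
    """Sum of squared loadings; Table 1 prints loadings without decimal points."""
    return sum((value / 100.0) ** 2 for value in loadings)


p, pq = productivity(28)            # hypothetical count of plus ratings
h2 = communality([43, -32, 26])     # original loadings of the first S in Table 1
print(round(p), round(pq), round(h2, 2))
```

The same `communality` function applied to the rotated loadings of an S should return nearly the same value, since an orthogonal rotation leaves communalities unchanged apart from rounding.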

To test the above noted hypothesis of rela- 
tions between factor loadings and productivity 
variables, product-moment correlations were 
computed between each of the original and 
rotated factors and P and PQ. The obtained 
correlations, along with the intercorrelation 
of P and PQ and the multiple correlation of 
P and PQ with the factor loadings, are given 
in Table 2. Factors II and B were both negatively
correlated with P and PQ, while factor
A was positively related to P and factor III
was negatively associated with PQ. The multiple
correlations of P and PQ with factors II
and B were significant at the .01 level (R =
.83 and .75) and with factor A and the communality
variable (h²) at the .05 level (R =
.61 and .59). The multiple correlation of P
and PQ with factor III (R = .55) was close
to statistical significance at the .05 level.
Since 10 of the 21 correlations were significant
at the .05 level or better, the hypothesis
that factor structure was affected by individual
differences between Ss in Rorschach
productivity appears confirmed.


² The transformation matrix for converting original
to rotated factor loadings has been deposited
with the American Documentation Institute. See
footnote 1.


186 A. W. Bendig and Roy M. Hamlin


Table 2

Product-Moment Correlations Between Factor
Loadings and Productivity Variables

                                      Multiple
Variables          P         PQ          R
Factor I          .03      -.07         .10
       II        -.79**    -.64**       .83**
       III       -.12      -.52*        .55
       A          .60*      .22         .61*
       B         -.58*     -.71**       .75**
       C          .16       .23         .23
h²               -.42       .14         .59*
PQ                .52*

* Significant at the .05 level.
** Significant at the .01 level.
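The multiple correlations in Table 2 can be recovered from the three pairwise correlations with the standard two-predictor formula, R² = (r₁² + r₂² - 2 r₁ r₂ r₁₂) / (1 - r₁₂²). The sketch below applies it to the tabled values for factors II and B; the function and argument names are illustrative assumptions.

```python
from math import sqrt


def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion with two correlated predictors."""
    r_squared = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    return sqrt(r_squared)


r_p_pq = 0.52  # intercorrelation of P and PQ from Table 2

print(round(multiple_r(-0.79, -0.64, r_p_pq), 2))  # factor II -> 0.83
print(round(multiple_r(-0.58, -0.71, r_p_pq), 2))  # factor B  -> 0.75
```

Both results agree with the reported R = .83 and .75, which also serves as a check on the legibility of the tabled correlations.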

The original hypothesis of this study was 
that one or more of the factors extracted from 
the intercorrelations of Rorschach profiles of 
the 16 Ss would correspond to their psychiatric
classification. To test the validity of the
loadings in discriminating among the psychiatric
groups, curvilinear correlations (eta)
were computed between the diagnostic grouping
of the Ss and nine variables: the six original
and rotated factors, the communality
(h²), and the two productivity variables (P
and PQ). In addition, the loadings on the six
factors and the communality were corrected 
for the influence of P and PQ (see Table 2) 
by computing a multiple regression equation 
for predicting loadings from the obtained P 
and PQ values for each S and subtracting 
these predicted loadings from the obtained 
loadings. Validity coefficients (eta) were then 
calculated for these corrected values. The un- 
corrected and corrected eta coefficients can be 
found in Table 3. Both of the productivity
variables, P and PQ, discriminated among the
psychiatric groups only at the .10 level (eta 
= .66 and .64). Comparison of the means of 
the four groups indicated that the imbecile 
group was low on both P and PQ with the 
other three groups showing similar means. 
Rotated factor B also discriminated among 
the groups at the .10 level (eta = .63). How- 
ever, the comparison of the group means 
showed the imbecile group to be high on this 
factor (the other groups again showing homo- 
geneous means) and the multiple correlation 
of factor B and the productivity variables (P 
and PQ) to be quite large (R = .75). When 
factor B loadings were corrected for their re- 
lationship to P and PQ, the validity coeffi- 
cient fell to an insignificant value (eta = .35). 
The promising validity of uncorrected factor 
B was apparently attributable solely to its re- 
lation to P and PQ. Six of the seven uncor- 
rected factor validity coefficients decreased in 
magnitude when they were corrected for the 
relation of the factor variables to the produc- 
tivity variables, while the eta for one factor 
variable (rotated factor A) increased after
correction (from .29 to a still insignificant
.54).
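The validity analysis above has two steps: a correlation ratio (eta) between diagnostic group and a variable, and a correction that removes from the loadings whatever P and PQ predict. The sketch below illustrates both with NumPy, using the P, PQ, and rotated factor A values as printed in Table 1. The function names are assumptions, and because the tabled values are rounded to two digits, the corrected eta printed by this sketch should not be expected to reproduce the published figure exactly.

```python
import numpy as np

groups = np.repeat(["anxiety", "depressed", "paranoid", "imbecile"], 4)
P = np.array([67, 64, 50, 40, 45, 67, 50, 62, 79, 17, 62, 36, 33, 24, 14, 29], dtype=float)
PQ = np.array([22, 23, 25, 24, 25, 22, 25, 24, 17, 14, 24, 23, 22, 18, 12, 21], dtype=float)
A = np.array([54, 66, -12, -45, 24, 39, 0, 8, 27, -39, 9, -15, 14, 1, -13, 32]) / 100.0


def eta(values, labels):
    """Correlation ratio: sqrt(between-group SS / total SS)."""
    values = np.asarray(values, dtype=float)
    grand = values.mean()
    ss_total = ((values - grand) ** 2).sum()
    ss_between = sum(
        (labels == g).sum() * (values[labels == g].mean() - grand) ** 2
        for g in np.unique(labels)
    )
    return float(np.sqrt(ss_between / ss_total))


def residualize(y, predictors):
    """Subtract the least-squares prediction of y from the given predictors."""
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ beta


print(round(eta(P, groups), 2), round(eta(PQ, groups), 2))  # uncorrected etas for P and PQ
corrected_a = residualize(A, [P, PQ])
print(round(eta(A, groups), 2), round(eta(corrected_a, groups), 2))
```

With the tabled values, the uncorrected etas for P, PQ, and factor A come out at .66, .64, and .29, in agreement with Table 3.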


Table 3

Validity Coefficients (Eta) of Factor Loadings in
Discriminating Psychiatric Groups

                                 Corrected for
Variables        Uncorrected       P and PQ
Factor I             .24              .26
       II            .54              .41
       III           .43              .42
       A             .29              .54
       B             .63*             .35
       C             .15              .12
h²                   .48              .35
P                    .66*
PQ                   .64*

* Significant at the .10 level.


Discussion


In general, the results of this application of
inverted factor analytic methods to the Rorschach
profiles of psychiatric groups are negative.
The intercorrelations between the Ss on
the 42 Rorschach elements were relatively
low and the three factors extracted accounted
for only 34 per cent of the total variability.
This is in marked contrast with the results of
Wittenborn and Weiss (5) where the extraction
of six factors from the intercorrelation of
20 Ss on 55 rating scale variables accounted
for 68 per cent of the total variance. This low
factor communality militates against a clear
factor structure and consequently against an
unequivocal grouping of the Ss. Many influences
may have lowered the communality:
low Rorschach element reliability, the type of
correlation coefficient used, etc.

The obtained factor structure showed no 
evidence of validity against the psychiatric 
category criterion. Only the productivity vari- 
ables, based on the number of Rorschach re- 
sponses, demonstrated any relation to the 
psychiatric grouping and this only because of 
the inclusion of an imbecile group of Ss. As 
found by Wittenborn (4), the obtained fac- 
tor structure was related to Ss’ productivity 
and when this relation was statistically elimi- 
nated the factor loadings showed no sig- 
nificant relation to psychiatric classification. 
These results are in marked contrast to those 
of Chambers (3) who found clinical inter- 
preters of Rorschach protocols to demonstrate 
significant psychiatric validity. Five clinical 
Ss were selected from five behaviorally de- 
fined psychiatric categories (the four used in 
the present study plus a paretic group) with 
the selection criteria emphasizing unequivo- 
calness of diagnosis and intragroup homoge- 
neity in clinical behavior. Five total Ror- 
schachs, one from each psychiatric group, 
were submitted to 20 experienced Rorschach 
interpreters who were requested to psychi- 
atrically categorize these “blind” Rorschach 
protocols. The highly significant number of 
correct categorizations gave evidence of the 
adequacy of clinical judgment in psychiatric 
diagnosis. 

Probably the most adequate hypothesis to 
explain the differences in psychiatric validity 
of the Rorschach profile and judgment ap- 
proaches lies in the additional clinical infor- 
mation contained in a complete protocol that 
is not present in a profile of traditional Ror- 
schach scoring categories. Basically this added 
information is drawn from two aspects of the
protocol: the specification that a given response
was elicited by a particular area on
a particular card, and the intracard and in- 
tercard sequencing of the responses. These 
sources of information are ignored in the 
usual scoring elements, but are commonly 
noted by experienced clinicians as being 
highly important in forming their judgments. 
One suggested advance in the objectification 
of Rorschach validity would consist of the de- 
velopment of methods of defining and quanti- 
fying these card-area, response-sequence, and 
profile-element interrelationships as additional 
predictors of psychiatric diagnosis. Another 
area of investigation lies in curvilinear and 
discontinuous relations among quantified as- 
pects of Rorschach performance and between 
these aspects and psychiatric diagnosis. Most 
of the heretofore developed psychometric 
methods assume rectilinearity and continuity: 
assumptions that appear quite hazardous in 
defining the structure of human personality. 
Which of these several approaches is poten- 
tially most fruitful could be decided by addi- 
tional studies of how the practicing and ex- 
perienced Rorschach interpreter makes his 
judgments and what information is most nec- 
essary to him for valid diagnosis. 


Summary 


Four clinical Ss from each of four psychi- 
atric diagnostic categories (anxiety neurotic, 
depressive, paranoid schizophrenic, and im- 
becile) were selected on the criteria of intra- 
group similarity in clinically relevant behav- 
ior and unequivocalness of diagnosis. Each S 
was correlated with every other S on their 
scores on 42 traditional Rorschach scoring 
categories and an inverted factor analysis 
computed from the matrix of intercorrelations 
between Ss. None of the three orthogonal 
original or rotated factors were significantly 
related to the psychiatric grouping of the Ss 
when the Rorschach response productivity of 
the individual Ss was statistically eliminated. 
The lack of diagnostic validity of this purely 
psychometric approach compared to the sig- 
nificant validity of the clinical judgment 
method was attributed to several sources of 
information in the complete Rorschach protocol
which are ignored by the usual scoring
categories.


Various Rorschach Indices as Discriminators of Marked
and Little Conceptual Disorganization Among
Schizophrenics¹


Bernard A. Stotsky and James F. Lawrence 
VA Hospital, Brockton, Massachusetts 


Among the principal symptoms of schizo- 
phrenia are disorganization of thought, loss of 
coherence in speech, and the development of 
false beliefs (5). Goldstein and Sheerer (8) 
and Vigotsky (16) regarded the loss or im- 
pairment of abstract behavior and concept 
formation as the key psychological disturb- 
ance in schizophrenia. Storch (14) described 
schizophrenic thinking as the substitution of 
concrete behavior for conceptual thinking. 

Rawlings (12) making use of the Yerkes- 
Bridges Point Scale and tests of association 
and imagination found marked impairment 
of abstraction and generalization in schizo- 
phrenia. A number of authors (2, 3, 8, 9, 16) 
have reported the impairment of the abstract 
attitude and of the ability to shift from one 
set to another. Cameron’s (4) work did not 
support previous findings in that he found 
the “abstract capacity level” to exist among 
schizophrenics provided that adequate rap- 
port and cooperation were obtained. 

In the area of intelligence, Hunt and Cofer 
(10) have noted that schizophrenics do more 
poorly than nonpsychotics of similar mental 
age on such conceptual tests as interpreting 
fables, noticing absurdities, finding similari- 
ties, and on tests of associative thinking. 
Every author has found that the variability 
of performance of schizophrenics is marked, 
usually larger than that of any control popu- 
lation. In every experiment there has been a 
considerable amount of overlap in test scores 
between schizophrenics and controls. We have
seen that even in an experiment (9) where
very significant differences were found between
schizophrenics and normals, only 26
of the 62 schizophrenics showed deficit when
comparisons were made with controls of comparable
educational background.


¹ From VA Hospital, Brockton, Massachusetts.
Rorschachs administered and scored by Murray A.
Cohen, Boston University trainee.

All the investigations cited above have con- 
cerned themselves with the study of schizo- 
phrenia as a single entity and have attempted 
to isolate some one factor or group of factors 
which would differentiate the schizophrenic 
from the normal or from all other types of 
psychopathology. Only in the past few years 
have systematic investigations been made of 
differences between schizophrenics. In one 
study Stotsky (15) reported that schizo- 
phrenics who remit tend to show at the out- 
set of hospitalization better intellectual and 
emotional control, greater clarity of the as- 
sociative processes, and greater ability to 
concentrate than schizophrenics who remain 
hospitalized. 

Encouraged by these findings the authors 
of the present study sought to investigate 
differences in degree of conceptual disorgani- 
zation among schizophrenics. More specifi- 
cally the purpose of this investigation was to 
examine the validity of certain Rorschach 
indices of conceptual disorganization on two 
groups of schizophrenics, one characterized 
by relatively little impairment of conceptual 
functioning, the other by a great deal of im- 
pairment. 


Predictions 


1. Since the proportion of good form responses
in a record is regarded by Rorschach
(13) as a measure of the clarity of associative
processes and by Beck (1) as an indicator
of accurate perception of reality, patients
manifesting little conceptual disorganization
will give a greater proportion of good
form responses. All responses containing a
form element (F, M, FC, CF, FY,
YF, and FV) will be scored for good form.

2. Since patients with less conceptual dis- 
organization may be expected to retain greater 
ability to integrate and articulate their re- 
sponses to ambiguous perceptual stimuli, these 
patients will show a greater proportion of 
genetically good location responses.²

3. If the number of popular responses is 
taken as a measure of the patient’s intel- 
lectual adaptability and ability to “share in 
the common way of perceiving things” (13, 
p. 198), patients with little conceptual im- 
pairment will give a greater number of popu- 
lar responses. 

4. Patients with less conceptual disorgani- 
zation should show greater retention of the 
ability to shift from one set to another. 
Therefore, they will show more shift from 
free association to inquiry for Rorschach de- 
terminants. 


Procedure 


Multidimensional psychiatric rating scale. 
Many chronic schizophrenic patients trans- 
ferred to this newly opened neuropsychiatric 
hospital from other Veterans Administration 
installations were rated just prior to transfer 
by means of the Hospital Form of the Multi- 
dimensional Scale for Rating Psychiatric Pa- 
tients (11). This form consists of 62 indi- 
vidual, unlabeled graphic rating scales which 
secure in a relatively objective and quantita- 
tive form a description of the observable be- 
havior or readily inferable traits and common 
symptoms of hospitalized patients. Reliability 
of rating is high. The scale has discriminated 
severely ill from mildly ill psychotic patients 
for the factors of conceptual disorganization, 
perceptual distortion, and motor disturbances 
(11). 

From the 62 items of the scale, 12 factors
or dimensions of psychopathology have been
obtained by its authors. One of these, factor
K, “conceptual disorganization,” was selected
as the criterion measure for differentiating
patients with marked disorganization of conceptual
thinking from patients with relatively
little. Factor K is characterized by irrelevant
and incoherent speech, a stereotyped repetition
of certain words and phrases and a frequent
blocking in speech, a disharmony between
the patient’s thinking and the feeling
tone shown or expressed, and a disorientation
for time. When present, delusional beliefs
tend to be impossible or bizarre (11, p. 6).


² The scoring of the quality of location (whole,
common detail, and rare detail) responses from a
genetic point of view was accomplished by means of
Friedman’s scoring criteria (6).

For each patient rated by the Multidimen- 
sional Scale the factor raw score on K was 
obtained by adding the numerical values of 
the ratings on the seven scale items descrip- 
tive of the factor. The items are listed below: 


1. How direct and relevant are his responses to 
questions or to the topic discussed? 

3. Are his thoughts and feelings consistent, or is 
there a discernible lack of harmony between them? 
(Reports, smiling, that he was tortured with blow 
torches all night.) 

4. How well oriented is he as to time? Does he 
know (a) the season; (b) the month; (c) the 
calendar year; (d) the day of the week? 

11. Are the elements of his speech logically con- 
sistent and connected by some idea or relationship, 
or do they tend to be inconsistent and disconnected? 
Rate what is most representative during the inter- 
view. 

28. Does he repeat certain words or phrases in a 
meaningless, stereotyped, or mechanical fashion? 

29. Is his speech irregularly interrupted, halted, or 
blocked for varying periods of time because he is 
not able to put the idea he has in words? (Do not 
rate stuttering, stammering, or muteness here.) 

38. Is there evidence of false ideas or beliefs? If 
present, are these ideas or beliefs (a) sufficiently 
plausible as to be accepted by a normal person un- 
informed as to the facts; (b) implausible but not 
impossible; (c) impossible or bizarre (e.g., mind 
controlled by neighbor’s radio waves, heart re- 
moved, or dead)? 


Items 1, 3, 11, 28, 29, and 38 were rated 
on a four-point scale, item 4 on a five-point 
scale. Patients obtaining a score of 10 or be- 
low for K (10 was the mean score for K of 
the reference population drawn from five VA 
hospitals) were considered to show relatively 
low conceptual disorganization. Patients with 
scores of 16 or greater for K (highest 17% 
of reference population) were considered to 
show high conceptual disorganization. 
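
The scoring rule just described can be sketched in a few lines. This is only an illustration under stated assumptions: the dictionary layout, the function names, and the example ratings are hypothetical, and the sketch takes the "numerical values of the ratings" as already available for each item rather than deriving them from the graphic scales.

```python
# Items that make up factor K ("conceptual disorganization") in the text above.
K_ITEMS = (1, 3, 4, 11, 28, 29, 38)

LOW_CUTOFF = 10   # scores of 10 or below: relatively low disorganization
HIGH_CUTOFF = 16  # scores of 16 or greater: high disorganization


def k_raw_score(ratings):
    """Sum the numerical ratings on the seven K items.

    `ratings` is a hypothetical dict mapping item number -> numerical rating.
    """
    missing = [item for item in K_ITEMS if item not in ratings]
    if missing:
        raise ValueError(f"missing ratings for items: {missing}")
    return sum(ratings[item] for item in K_ITEMS)


def classify_k(score):
    """Assign a score to the low group, the high group, or neither."""
    if score <= LOW_CUTOFF:
        return "low"
    if score >= HIGH_CUTOFF:
        return "high"
    return "unclassified"


# Hypothetical ratings for one patient (not data from the study).
example = {1: 1, 3: 2, 4: 1, 11: 2, 28: 1, 29: 1, 38: 1}
score = k_raw_score(example)
print(score, classify_k(score))
```

Patients scoring between the two cutoffs fall into neither criterion group, which is why the sketch returns a third label for them.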







Sampling. Patients were randomly selected 
from those scoring 10 or less and 16 or 
greater on K, and a Rorschach was adminis- 
tered individually to each testable patient 
within 60 days of admission to this hospital. 
The goal was to obtain 10 Rorschach rec- 
ords for each group. Of the first 15 patients 
approached in the low group, 10 cooperated 
sufficiently to give scorable Rorschachs with 
more than six responses. In the high group 
only 10 patients out of 23 were testable to 
the extent of meeting the criterion of seven 
responses to the 10 cards. 

Although all patients had taken Rorschachs 
within 60 days of admission, it was consid- 
ered desirable to rerate the patients on the 
seven criterion items of the scale to see 
whether any changes had taken place. The K 
scores for all 10 lows remained constant. For 
seven of the highs the scores remained con- 
stant. The scores for two increased slightly. 
The tenth high, however, had shown such 
marked improvement that his K score fell be- 
low 16, necessitating his elimination from the 
group. As finally constituted, the total sam- 
ple consisted of 10 low scorers and 9 high 
scorers on K. 

All subjects were white, male veterans be- 
low the age of 60. Comparison of the two 
groups for age, length of VA hospitalization, 
education, level of intelligence, schizophrenic 
subgroup, premorbid occupation, degree of 
rated disability, marital status, and religion 
showed no significant differences. In no case 
was there any evidence reported of cerebral 
lesion or neurological deficit. All patients had 
confirmed histories of schizophrenic symp- 
tomatology, accompanied by long periods of 
hospitalization. 

Rorschach scoring. The Rorschach proto- 
cols were scored for both free association and 
free association plus inquiry, using Beck’s 
system for scoring determinants, good and 
poor form, and popular responses. For loca- 
tion of responses the proportion of all good 
or mediocre, W, D, and Dd responses over 
total responses was used to compare the 
groups rather than considering each of the 
location categories separately. Thus, for W, 
D, and Dd, Friedman’s (6) plus-plus, plus, 
and mediocre scores were categorized as good, 
and vague, confabulatory, and minus re- 
sponses were categorized as poor. The de- 
gree of agreement between two clinicians 
scoring location by means of Friedman's cri- 
teria was greater than 85%, which was con- 
sidered adequate for this study. 
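
As a rough illustration of the location index described above, the following sketch pools W, D, and Dd responses and computes the percentage that fall in the good categories. The tuple layout, the function name, and the sample protocol are assumptions made for the example; only the grouping of Friedman's categories into good and poor comes from the text.

```python
# Friedman categories grouped as in the text above.
GOOD = {"plus-plus", "plus", "mediocre"}
POOR = {"vague", "confabulatory", "minus"}


def good_location_per_cent(scored_responses):
    """Return the percentage of responses whose location quality is good.

    `scored_responses` is a hypothetical list of (location, quality) tuples,
    where location is "W", "D", or "Dd". All locations are pooled, as in the
    text, rather than being treated separately.
    """
    if not scored_responses:
        raise ValueError("protocol has no scored responses")
    for _, quality in scored_responses:
        if quality not in GOOD | POOR:
            raise ValueError(f"unknown quality category: {quality}")
    good = sum(1 for _, quality in scored_responses if quality in GOOD)
    return 100.0 * good / len(scored_responses)


# Hypothetical four-response protocol (not data from the study).
protocol = [("W", "plus"), ("D", "mediocre"), ("D", "vague"), ("Dd", "minus")]
print(good_location_per_cent(protocol))
```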

Prior to the analysis of findings both groups 
were compared for total number of responses 
(R). No significant difference was found be- 
tween the mean R of 17.8 for the low group 
and of 16.3 for the high group. 


Findings 


1. Prediction for good form. The predic- 
tion for good form was sustained at the .001 
level of significance with the low group giv- 
ing a significantly greater proportion of good 
form responses. There was actually no over- 
lap between the groups inasmuch as the high- 
est score for the highs was 70 and the lowest 
for the lows was 77.³ 








Table 1 
Comparison of High and Low Patients for Rorschach 
Indices of Conceptual Disorganization 

                              Highs            Lows 
Rorschach variable         Mean   Sigma    Mean   Sigma      t      χ²      p 
Good form per cent         47.0   17.1     86.4    6.0              15.4    .001 
Good location per cent     38.1   23.2     82.6    8.2              15.4    .001 
Populars                   [means and sigmas illegible]     3.3             .01 
Total determinant shift     1.4    1.2      3.1    1.2      2.4 





2. Prediction for location. The difference 
between the low and high groups was signifi- 
cant at the .001 level in the predicted direc- 
tion for location. Here also there was no over- 
lap on scores between the two groups, the 
highest score for the highs being 63, the low- 
est for the lows, 71. 

3. Prediction of populars. The prediction 
with regard to P was supported at the .01 
level of significance with the low group giv- 
ing almost twice as many popular responses 
as the high group. 

3 For Predictions 1 and 2 it was necessary to use 
chi square rather than t in the analysis of the data 
since the variances for the high group were signifi- 
cantly greater than those for the low group. For 
good form per cent the cutting point was 75, for 
good location per cent the cutting point was 70. 

4. Prediction for total determinant shift. 
The low group showed significantly greater 
shift (p= .03) for determinants from free 
association to inquiry. This confirms the pre- 
diction that such a group is better able to 
shift from one set to another and is more re- 
sponsive to a change in stimulus. 
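
For Predictions 1 and 2 the scores were split at a cutting point and the groups compared by chi square. Because there was no overlap, every low scorer falls above the cutting point and every high scorer below it, so the fourfold table is fully determined. The sketch below computes chi square for such a table, with and without Yates' continuity correction. The paper does not say which computing formula was used, so both the choice of formula and the function name are assumptions; under them the corrected value comes out near 15.2, close to but not identical with the tabled 15.4.

```python
def chi_square_2x2(a, b, c, d, yates=True):
    """Chi square (1 df) for the fourfold table [[a, b], [c, d]].

    With yates=True, Yates' continuity correction is applied.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    return n * diff ** 2 / denom


# Rows: low group, high group. Columns: above, at or below the cutting point.
# No overlap means 10 lows above, 0 below; 0 highs above, 9 below.
corrected = chi_square_2x2(10, 0, 0, 9, yates=True)
uncorrected = chi_square_2x2(10, 0, 0, 9, yates=False)
print(round(corrected, 2), round(uncorrected, 2))
```

Either value exceeds 10.83, the conventional .001 critical value for one degree of freedom.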


Discussion 


All four predictions concerning the Ror- 
schach indices of conceptual disorganization 
were confirmed below the .05 level of signifi- 
cance. Of greater importance was the fact 
that for two of these indices, good form per 
cent and good location per cent, there was no 
overlap between the two groups. Tentatively 
these findings suggest that low scores for 
these two Rorschach variables may be taken 
to indicate conceptual disorganization among 
schizophrenic patients. 

The findings for form and populars sup- 
port the notion that the schizophrenics low 
on conceptual disorganization are able to as- 
sociate accurately to the blots and to share 
in the common way of perceiving things. The 
superior performance of the lows on the lo- 
cation categories suggests that these patients 
not only retain their ability to integrate their 
responses to the plates, but that they may 
even tend to approach the level of response 
of neurotics and normals on these variables. 

While the difference between the two groups 
for determinant shift is significant, it is note- 
worthy that neither group shifts a great deal. 
Gibby and Stotsky (7) in comparing psy- 
chotic and psychoneurotic patients found 
that the average amount of shift for the psy- 
chotic group was 3.15 while that for the neu- 
rotic group was greater, 5.15. Thus in the 
present study the psychotic patients with low 
conceptual disorganization show roughly the 
same amount of shift as the psychotics 
studied at a VA general medical and surgical 
hospital (7). The average shift for the high 
and low groups combined is much lower, 
roughly 2.3. The smaller amount of shift for 
these patients as compared with the psychot- 
ics in the earlier study may be due to their 
being more chronically schizophrenic and as 
a result of their illness probably less respon- 
sive to the shift in stimulus from free asso- 
ciation to inquiry. 

Of what significance are the present find- 
ings with regard to schizophrenia in general? 
The definite differences, on both the rating 
scale and the Rorschach indices, between the 
two groups, both of which consisted of 
chronic schizophrenic patients, raise the ques- 
tion as to the advisability of regarding all 
schizophrenics as individuals incapable of 
higher level thinking such as is found in nor- 
mals. The present findings indicate that there 
are many schizophrenic patients who are able 
to shift from one set to another and who can 
integrate perceptual stimuli in a manner simi- 
lar to that of nonschizophrenic subjects. 
Among such patients the schizophrenic proc- 
ess does not manifest itself by marked con- 
ceptual disorganization. In fact they show 
minimal conceptual impairment on the meas- 
ures employed in this study. 


Summary 


1. By means of the Multidimensional Psy- 
chiatric Rating Scale, 10 chronic schizo- 
phrenic patients showing little conceptual dis- 
organization and 9 chronic schizophrenic 
patients showing marked conceptual disor- 
ganization were selected and compared on 
four Rorschach indices of conceptual disor- 
ganization: good form per cent, good location 
per cent, populars, and determinant shift. 

2. It was predicted that patients low on 
conceptual disorganization would show a 
greater proportion of good form and good 
location responses, a greater number of popu- 
lars, and more determinant shift from free 
association to inquiry than patients high on 
conceptual disorganization. All predictions 
were confirmed at significant levels of con- 
fidence. 

3. The findings were discussed with regard 
to their application to the consideration of 
schizophrenia as a single entity. 


Received August 19, 1954. 


The Rorschach may be used to rate indi- 
viduals on single or multiple dimensions with 
respect to relative or absolute standards. Al- 
though the literature on the Rorschach has 
been concerned mainly with the adequacy of 
clinical descriptions of individuals, in some 
situations, as in a prison, it is often of greater 
practical importance to rate people with re- 
spect to one over-all quality, such as “adjust- 
ment.” For classification boards and parole 
committees the important question is “How 
is he?” rather than “What is he?” Prison offi- 
cials want to know whether an individual 
placed in a particular situation of trust will 
be able to meet it effectively, and they are not 
too concerned with why or how he will suc- 
ceed. They are interested in meaningful pre- 
dictive statements and not in understanding 
individual dynamics. 

In either case, the validity of the Rorschach 
is at issue. The difficulties inherent in the 
validation of projective techniques have been 
ably summarized by Macfarlane and Tudden- 
ham (2) and need not be reviewed again. 
One kind of validation may be made by test- 
ing for criterion discrimination, that is the 
capacity of the Rorschach to separate indi- 
viduals known or believed to differ with re- 
spect to some criterion quality. However, to 
eliminate other cues, such as will occur in 
face-to-face contact between the examiner 
and the subject, it is necessary to have Ror- 
schachs evaluated blindly. This is particu- 
larly true in this investigation, since half of 
the subjects were prisoners, the other half 
were prison guards. 


The Problem 


In a prior paper (1) it was reported that, 
when 50 “normal” Rorschachs and 50 “abnor- 
mal” Rorschachs were evaluated by means of 
a check list, the rank-ordered ratings placed 
29 normals in the top 50 rather than the 
chance expectancy of 25. The degree of sepa- 
ration was not quite significant at the 5 per 
cent level. The question arises of the capacity 
of psychologists to separate such protocols by 
clinical judgment in contrast to the mechani- 
cal procedure of the check list. 
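
The statement that 29 of 50 is "not quite significant" can be checked with a fourfold chi square, treating top half versus bottom half against normal versus abnormal. The paper does not name the test it used, so this is one candidate among several and not necessarily the authors' method; the function name is hypothetical, and no continuity correction is applied.

```python
CHI2_CRIT_05 = 3.841  # chi square critical value, 1 df, 5 per cent level
CHI2_CRIT_01 = 6.635  # chi square critical value, 1 df, 1 per cent level


def chi_square_top_half(normals_in_top, per_group=50):
    """Uncorrected chi square for normals vs. abnormals by top vs. bottom half.

    Assumes equal group sizes and a top half holding `per_group` protocols,
    so the whole table follows from the number of normals in the top half.
    """
    a = normals_in_top       # normals ranked in the top half
    b = per_group - a        # normals ranked in the bottom half
    c = per_group - a        # abnormals ranked in the top half
    d = a                    # abnormals ranked in the bottom half
    n = 2 * per_group
    return n * (a * d - b * c) ** 2 / per_group ** 4


check_list = chi_square_top_half(29)
print(round(check_list, 2), check_list < CHI2_CRIT_05)
```

Under these assumptions 29 of 50 gives a value below the 5 per cent critical value, in line with the text, and the same function can be applied to any other count of normals placed in the top half.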


Method 


The same 100 protocols used in the prior 
investigation were used in this study. The 50 
normal protocols were of newly hired prison 
guards who took the Rorschach as part of the 
routine of indoctrination into the penal serv- 
ice. They had passed civil service investiga- 
tions of their character, had the equivalent of 
high school education, and had passed two in- 
terviews designed, among other purposes, to 
evaluate their emotional stability. There was 
no reason to believe that they were in any 
way different with respect to normality of 
personality from an unselected group of the 
same age. If anything, they may have been 
superior, as a result of the screening. The 50 
abnormal protocols were of prisoners who had 
committed serious felony crimes and who had 
been referred to the psychologists for projec- 
tive testing by the psychiatrist because in his 
opinion they showed signs of serious person- 
ality deviations. About 10 per cent of prison- 
ers in this particular institution are referred 
for further intensive testing which usually in- 
cludes the Rorschach. 

Each of the guard protocols was matched 
with that of a prisoner. Ages were balanced 
within three years; only inmates who had rat- 
ings of average intelligence, and who came 
from lower middle social classes, were used. 
All subjects were white. 

The protocols were assembled with a sum- 
mary sheet. All identifications were removed. 
The protocols were then given to three psy- 
chologists with instructions to rank them 
from 1 to 100 with respect to the most prob- 
able adjustment of the subjects. Psycholo- 
gists A and B were asked to take as much 
time as they considered necessary to make a 
good decision. Psychologist C was asked to 
make “snap” decisions. Psychologist D, who 
had administered all the Rorschachs, was 
asked to rank the summaries of his reports 
from 1 to 100 in terms of favorableness with 
respect to adjustment. 


Results 


Psychologist A averaged 7.2 minutes per 
judgment; Psychologist B, 6.0 minutes; and 
Psychologist C, 1.1 minutes. Time was not re- 
corded for Psychologist D. The intercorrela- 
tions of the rankings between the psycholo- 
gists were very low, in terms of usual psycho- 
metric expectations. The rankings of A and 
B, A and C, and B and C, correlated only 
.22, .35, and .45, respectively. The judgments 
of these three psychologists, respectively, 
against the ratings of the report summaries 
prepared by Psychologist D, correlated .33, 
.58, and .50. 

By chance it might be expected that 25 of 
the normal protocols might be placed in the 
top half of the rankings. It had been found 
that through using the Davidson Scale 29 
were placed in the top half. Psychologists A 
and B placed, respectively, 34 and 42. Both 
exceeded the 1 per cent level of significance. 
Psychologist C, who worked very rapidly at 
the rate of about one judgment per minute, 
exceeded the Davidson Scale by placing 30 of 
the guard protocols rather than the chance 
expectancy of 25, thereby just meeting the 5 
per cent level of significance. All three of the 
psychologists, who used their clinical judg- 
ment, were better than a check list for the 
purpose of separating this population of Ror- 
schach protocols with respect to “normality.” 
Psychologist D, who had worked from his 
own summary reports, put 40 of the guards in 
the top 50. Psychologist D’s results, however, 
could be seriously contaminated because he 
saw the examinees in face-to-face situations 
and could have been influenced by clues not 
obtained from the Rorschach. The placements 
of the four psychologists in terms of quartiles 
are shown in Table 1. 


Table 1 
Number of Guard Protocols in Each of the 
Four Quartiles of Ratings 

                     Quartiles 
Psychologist     I     II    III    IV 
A               20     14     10     6 
B               23     19      6     2 
C               18     12     12     8 
D               24     16      8     2 
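
The top-half counts quoted in the text follow from the quartile table by adding Quartiles I and II. A minimal sketch of that arithmetic is given below. It assumes that the two cells printed illegibly in the source are 8, which is the only value that makes each row total the 50 guard protocols; the variable and function names are hypothetical.

```python
# Guard protocols per quartile (I, II, III, IV) for each psychologist.
# The 8s for C (Quartile IV) and D (Quartile III) are inferred from row totals.
quartile_counts = {
    "A": (20, 14, 10, 6),
    "B": (23, 19, 6, 2),
    "C": (18, 12, 12, 8),
    "D": (24, 16, 8, 2),
}


def top_half(counts):
    """Number of guard protocols placed in Quartiles I and II."""
    return counts[0] + counts[1]


def per_cent_in_top_half(counts):
    """Top-half placements as a percentage of all guard protocols."""
    return 100.0 * top_half(counts) / sum(counts)


for name, counts in quartile_counts.items():
    print(name, top_half(counts), per_cent_in_top_half(counts))
```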


Summary 


The most surprising result of this investi- 
gation is the very low order of reliability, 
in terms of usual psychometric expectations, 
found in the rankings of the Rorschach pro- 
tocols by the psychologists with respect to 
normality. Of course, no information exists as 
to the range of “normality-abnormality” of 
the subjects, but presumably there was no 
attenuation due to restriction of range. 

It would appear that adequacy of judg- 
ments is a function of time spent on the judg- 
ing process. The three psychologists were con- 
sidered equal in their abilities, but the psy- 
chologist who made the “snap” judgments 
did more poorly than the other two who 
worked at what they considered optimal 
speed. The best score of all, obtained by Psy- 
chologist B, a new employee of the Depart- 
ment of Public Welfare, was obtained despite 
the fact that he had not known that the pro- 
tocols were of guards and prisoners. 

The Davidson Rorschach Adjustment Scale 
selected 58 per cent of normals, the psycholo- 
gists, 60, 68, and 84 per cent from the proto- 
cols, and the psychologist working from his 
own reports, 80 per cent. 

It may be concluded that psychologists’ 
clinical judgments of Rorschachs, while quite 
deficient in agreement with one another, are 
nevertheless superior to the Davidson Ror- 
schach Adjustment Scale for the purpose of 
separating two groups of adults who differ 
with respect to over-all social adjustment. 


Received September 27, 1954. 


The Effect of the Psychotherapist’s Personal 
Analysis Upon His Techniques¹ 

Washington, D. C.² 


Two previous publications have presented 
objective evidence on (a) the techniques of 
Rogerian and psychoanalytically oriented psy- 
chotherapists (5), and (b) the verbal behav- 
ior of psychiatrists, psychologists, and psy- 
chiatric social workers of varying levels of 
experience (6). These articles also set forth 
the rationale for focusing attention upon the 
psychotherapist and for studying systemati- 
cally the concomitants of his techniques. The 
present paper is an extension of the previous 
inquiry, and attempts to answer the question: 
What is the effect of the therapist’s personal 
analysis upon his therapeutic techniques? 

Since the time of Freud no training require- 
ment for the psychotherapist has been consid- 
ered more important than his personal analy- 
sis, and the curricula of psychoanalytic train- 
ing institutes have traditionally been built 
around this aspect of the candidate’s training. 
In view of the voluminous literature on the 
subject, little need be said about the argu- 
ments commonly advanced in favor of this 
training requirement. Fromm-Reichmann’s 
summary, while perhaps unduly restrictive, is 
representative of current thinking: “And so 
it is that, because of the interrelatedness be- 
tween the psychiatrist’s and the patient’s in- 
terpersonal processes and because of the in- 
terpersonal character of the psychotherapeutic 
process itself, any attempt at intensive psy- 
chotherapy is fraught with danger, hence un- 
acceptable, where not preceded by the future 
psychiatrist’s personal analysis” (2, p. 42). 

1 This article is the third in a series of investiga- 
tions concerned with the objective study of psycho- 
therapists’ verbal operations, and represents one 
phase of the author’s doctoral dissertation submitted 
to the Graduate Council of the George Washington 
University (4). Part of this material was presented 
at the annual meeting of the American Psychologi- 
cal Association in New York, September 1954. The 
writer wishes to express his appreciation to Dr. 
Dorothy E. Green for valuable statistical advice. 

2 While carrying out this work, the author held 
positions successively with the Department of the 
Air Force and the Department of the Army. Neither 
organization has any connection with this research. 

If this position is correct, the experience of 
personal analysis should have a demonstrable 
effect upon the character of the therapist’s 
operations. Even if present-day methods for 
objectively examining such an elusive vari- 
able are crude, there must be systematic dif- 
ferences between the techniques of therapists 
whose training has included personal analysis 
and those whose training has not. After the 
exploration of this general hypothesis, the 
problem will be pursued with reference to a 
few salient therapeutic situations. 


Three Kinds of Patient Maneuvers 


Suicide threats. A suicide threat is one kind 
of power operation by which the patient at- 
tempts to bring a significant person under his 
control. The reaction is, of course, the cul- 
mination of a long series of frustrations, re- 
buffs, and injuries which have been inflicted 
in reality or fantasy upon the victim, and 
constitutes a grandiose attempt at revenge by 
turning the tables upon the perceived aggres- 
sor. Without attempting to give the dynamics 
more than cursory treatment, it may be rec- 
ognized that a suicide threat contains as its 
essential ingredients: (a) profound disap- 
pointment, frustration, and suffering; and 
(b) pervasive reactions of rage, aggression, 
and hostility. 

Operationally, self-destructive threats ex- 
pressed by a patient may serve to evoke the 
therapist’s pity and commiseration, they may 
be designed to perpetuate a dependent rela- 
tionship, or they may represent a maneuver 
to dominate and to make the therapist re- 
sponsible for the patient’s actions. 

With regard to recommended therapeutic 
approaches, Fromm-Reichmann states: “A 
thorough investigation of the validity of the 
causes for the patient’s discouragement, un- 
happiness, or despair must be the starting 
point of every therapeutic approach to sui- 
cide, or suicidal attempts and fantasies” (2, 
p. 198). By the same token, reassurance, pity, 
etc. appear to be ill-advised: “If the therapist 
offers encouragement to patients’ parataxic 
expectations by actually falling into the role 
of practical adviser, he retards the process of 
insight into the immature character of such 
expectations; hence he retards the process of 
resolving them” (2, pp. 208-209). 

The following hypothesis tests whether 
therapists do what they preach: 


Hypothesis 1. When confronted with sui- 
cide threats, psychotherapists include in their 
responses (a) a relatively large number of 
explorations; (b) a relatively small number 
of responses that convey reassurance; and 
(c) a relatively small number of interpreta- 
tions. 

If the therapist who has undergone per- 
sonal analysis is more clearly aware of the 
dynamics of the situation, Hypothesis 1 may 
be extended by stating that the predicted 
trend is more pronounced for therapists whose 
training has included personal therapy. 

Transference reactions. In its special psy- 
chotherapeutic meaning, transference refers to 
all processes whereby the patient re-enacts in 
the therapeutic situation unresolved interper- 
sonal conflicts, and casts the therapist into 
the role of the significant adult who figured 
prominently in the earlier experience. Trans- 
ference constitutes the cornerstone of all 
modern conceptions of psychoanalytic ther- 
apy and is increasingly utilized by other 
forms of therapy as well. This emphasis also 
implies that a significant segment of the 
therapeutic work consists in clarifying the 
doctor-patient relationship and concomitantly 
the distortions which complicate the patient’s 
living. 

The giving of interpretations has to be 
carefully planned and timing is all-important. 
Interpretations which are offered prematurely, 
that is, before the establishment of a rela- 
tively durable doctor-patient relationship and 
relaxation of the patient’s emotional defenses, 
are useless if not positively harmful. Still, 
transference reactions manifested by the pa- 
tient at almost any stage of the treatment 
are likely to be dealt with by means of inter- 
pretations, particularly by analyzed thera- 
pists who themselves have been exposed to 
the psychoanalytic maxim of giving primary 
attention to transference phenomena. The 
second hypothesis is: 

Hypothesis 2. When confronted with trans- 
ference reactions, psychotherapists employ a 
relatively large number of interpretations. 
Analyzed therapists exhibit this tendency to 
a greater extent. 

Schizoid productions. The third hypothesis 
deals with the productions of a single patient 
who appears seriously disturbed, and who 
gives the impression of being on the verge of 
a schizophrenic break. It appears plausible 
that such a patient would be treated differ- 
ently early in therapy than one who is less 
disturbed, anxious, and confused. For one 
thing, the therapeutic effort obviously must 
be directed at bringing about a diminution 
rather than an increase in anxiety; for an- 
other, the therapist must attempt to obtain 
a fuller picture of the underlying dynamics. 
Since the technique of silence is often used to 
create “a sort of vacuum [which] pushes the 
patient into taking the initiative in the rela- 
tionship” (7, p. 221), it increases of neces- 
sity the patient’s anxiety, for which reason it 
would be contraindicated with a very anxious 
patient. As for the second objective, a fair 
number of exploratory questions may be an- 
ticipated. On the assumption that the ana- 
lyzed therapist will be more sensitive to the 
requirements of the situation, Hypothesis 3 
may be stated: 

Hypothesis 3. When confronted with the 
productions of a seriously disturbed, near- 
psychotic patient, psychotherapists employ 
(a) a relatively small number of silent re- 
sponses, and (b) a relatively large number of 
explorations. Analyzed therapists follow this 
trend in more clear-cut fashion. 


Method, Procedure, and Subjects 


A fuller description has been given in the 
earlier article (6). For the present, it suffices 
to reiterate that samples of psychotherapists’ 
verbal behavior were elicited by presenting a 
series of 27 cards containing short paragraphs 
of patient statements selected from published 
therapeutic interviews. Included in the series, 
which was considered a fair cross section of 
verbalizations heard from neurotic patients 
in early interviews, were four suicide threats, 
six transference reactions (including an in- 
sistent request for direct advice, denials of 
the need for therapy, and examples of com- 
petitiveness and open hostility), and six com- 
plaint statements of a seriously disturbed 
near-psychotic patient.³ 


Table 1 
The Sample by Professional Affiliation, Experience 
Level, and Personal Analysis 

                          Personal analysis        No personal analysis 
Group                   Exp.  Inexp.  Subtotal    Exp.  Inexp.  Subtotal    Total 
Psychiatrists            14      3       17         1      7        8        25 
Psychologists             3      3        6         0      1        1         7 
Psychiatric social 
  workers                 6      1        7         1      1        2         9 
Total                    23      7       30         2      9       11        41 





Subjects were the same 25 psychiatrists, 7 
psychologists, and 9 psychiatric social work- 
ers who participated in the previous research 
(6). All were psychoanalytically oriented. 
Length of experience in psychotherapy ranged 
from 1 to 12 years, with a median of 5 years. 
Practically all of the experienced therapists 
(5 years and above), and about half of the 
inexperienced therapists (less than 5 years) 
had undergone personal analysis as part of 
their training. These data are shown in Table 
1. While no systematic information was ob- 
tained concerning the length of a respondent’s 
personal analysis, it is safe to assume that for 
the most part it had been fairly extensive, 
certainly beyond one year. A few therapists 
who had only recently started their analysis 
were included in the unanalyzed group. 

3 Adapted from Rogers’ case “Miss Tilden,” re- 
ported in Snyder (3); by permission of Professor 
Snyder. 


Table 2 
Brief Description of Scoring Categories 

Category 
number     Core meaning 
 1         Reassurance 
 2         Not applicable* 
 3         Silence (passive acceptance) 
 4         Structuring 
 5         Interpretation 
 6         Reflection of feeling 
 7         Factual questions 
 8         Exploration 
 9         Not applicable* 
10         Passive rejection 
11         Not applicable* 
12         Antagonism 

* Probably a function of the experimental conditions 
under which responses were elicited. 


Results 


Quantification of therapists’ responses was 
accomplished by means of Bales’ (1) system 
of interaction process analysis. A total of 
1,609 score units was categorized by this 
method. Rater agreement was tested on a 
stratified random sample of 370 score units 
(10 cases) and found to be 78 per cent. A 
brief description of the scoring categories is 
given in Table 2.⁴ 

Figure 1 presents the response profiles of 
analyzed and unanalyzed therapists, regard- 
less of professional affiliation. The statistical 
analyses test the significance of percentage 
differences in a particular response category; 
t values significant at the .01 or .05 level are 
indicated in the figure. The findings may be 
summarized as follows: 


1. The two response profiles show consid- 
erable overlap in several of the major cate- 
gories; nevertheless, a chi square computed 
on an over-all basis is significant beyond the 
.001 level of confidence. 

2. A pronounced discrepancy occurs in 
Category 3, denoting that analyzed therapists 
give significantly fewer silent (passive accept- 
ance) responses, and conversely that unana- 
lyzed therapists are more passive.⁵ 

3. Small, but statistically significant differ- 
ences in Categories 10 and 12 (passive rejec- 
tion and antagonism), while possibly quite 
real, are based upon small frequencies and 
should be regarded as tentative. 

4 For a fuller description of the scoring categories, 
see the earlier publications (5, 6). 

Fig. 1. Response distributions of analyzed and un- 
analyzed therapists (personal analysis, n = 1,173; no 
personal analysis, n = 436). 

Since the mean length of experience of 
analyzed therapists was significantly greater 
than that of unanalyzed therapists (F = 
6.33, p< .05), there was reason to question 
whether the significant difference between 
analyzed and unanalyzed therapists with re- 
spect to the number of silent responses was 
attributable to personal analysis, or whether 
it was rather a function of length of experi- 
ence. The analysis of covariance technique 
was employed to determine whether the dif- 
ference between the number of silent re- 
sponses by analyzed and unanalyzed thera- 
pists was significant when the effect of 
experience was controlled. The obtained F 
value of 9.39 (p < .01) indicates that, as far 
as the present data are concerned, the differ- 
ence in silent responses between analyzed and 
unanalyzed therapists is attributable to the 
variable of personal analysis. This result was 
to be expected since the correlation between 
length of experience and number of silent re- 
sponses is .12, which is not significantly dif- 
ferent from zero. 

5 The findings reported for Rogerian therapists (5) 
corroborate this result. 

To test the three specific hypotheses, thera- 
pists’ responses to each subseries (suicide 
threats, transference reactions, and schizoid 
productions) were compared with responses 
to all other statements. These operations were 
performed separately for analyzed and for 
unanalyzed therapists. Subseries responses 
were, of course, always excluded from the 
over-all frequencies. 

Suicide threats. Figure 2 shows the re- 
sponse profiles of analyzed and unanalyzed 
therapists to suicide threats and to all other 
statements. The following differences may be 
observed: 


1. Unanalyzed therapists give a larger
number of reassuring responses (Category 1)
to suicide threats; for analyzed practitioners
the trend is in the same direction (t = 1.95,
p < .06).

2. Unanalyzed therapists disclose a signifi- 
cant decrease in reflective responses (Cate- 
gory 6). 
Fig. 2. Response distributions of analyzed and
unanalyzed therapists to suicide threats and to all
other statements (personal analysis: suicide, n = 182;
over-all, n = 991; no personal analysis: suicide,
n = 64; over-all, n = 372).

Fig. 3. Response distributions of analyzed and
unanalyzed therapists to transference reactions and
to all other statements (personal analysis:
transference, n = 273; over-all, n = 900; no personal
analysis: transference, n = 99; over-all, n = 337).


3. Both groups show a slight preference
for structuring responses (Category 4) when
dealing with suicide threats; however, neither
difference is significant at the .05 level
(analyzed group: t = 1.63, p < .11; unanalyzed
group: t = 1.94, p < .06).

None of the above outcomes is in accord- 
ance with Hypothesis 1; in fact, the incre- 
ment in reassuring responses is in opposition 
to the prediction, except that unanalyzed 
therapists appear to give a relatively larger 
number of reassuring responses than do ana- 
lyzed workers. 

Transference reactions. Figure 3 compares 
the response profiles of analyzed and unana- 
lyzed therapists to transference reactions and 
to all other statements. The following observa- 
tions may be made: 


1. Both groups show an increase in inter- 
pretive responses (Category 5), but the dif- 
ference is statistically reliable only in the 
case of the analyzed therapists. 

2. Both analyzed and unanalyzed prac- 
titioners tend to give a larger proportion of 
silent responses (Category 3), but again the 
difference is significant only for the former 
group. It should be recalled here that this 
category revealed a significant discrepancy in 
the comparison of over-all distributions, un- 
analyzed workers exceeding the analyzed 
therapists. 

3. Increments in structuring (Category 4)
and decrements in exploratory responses
(Category 8) are apparent for the analyzed
group. These trends appear to be paralleled
by the unanalyzed group, but here the differences
are not significant, probably because
the absolute numerical frequencies are considerably
smaller. The difference in Category
7 is inconsistent and may be due to chance.

Fig. 4. Response distributions of analyzed and
unanalyzed therapists to schizoid productions and to
all other statements (personal analysis: schizoid
productions, n = 274; over-all, n = 899; no personal
analysis: schizoid productions, n = 94; over-all,
n = 342).

Hypothesis 2, which predicted an increase 
of interpretive responses, is thus partially 
confirmed; however, the verbal behavior of 
analyzed therapists seems to follow the same 
trend as that of their unanalyzed confreres. 
Collateral statistical treatments indicated that
psychologists and social workers do not contribute
to this shift.

Schizoid productions. Figure 4, which pre- 
sents the by now familiar comparisons for 
schizoid productions, leads to the following 
interpretations: 


1. Analyzed therapists’ responses are characterized
by a marked decrease in silent responses
(significant at the .05 level), whereas
unanalyzed therapists reveal an increase (not 
statistically significant). 

2. Analyzed therapists use a proportionate 
number of explorations with this patient; 
unanalyzed workers, on the other hand, give 
significantly fewer exploratory responses. 

3. The differences in Category 7 (direct 
questions) are traceable to an artifact in the 
items of the subseries. Another inconsistent 
difference in Category 6 may be due to 
chance. 

Hypothesis 3 is partially confirmed by 
these results, particularly with respect to the 
decrease in silent responses for analyzed 
workers. If it is recalled that analyzed thera- 
pists were shown to be more active in the 
over-all comparisons, it follows that the gap 
between the analyzed and the unanalyzed 
groups is widened when responses to this pa- 
tient occupy the focus of the statistical treat- 
ment. Exploratory responses, while undergo- 
ing no shift in the case of analyzed thera- 
pists, reveal a pronounced decrease in the 
unanalyzed group. This result is likewise in 
the predicted direction. 


Discussion 
In interpreting the results of this study cer- 
tain limitations must be kept in mind: 


1. Therapists’ responses were secured by 
means of an experimental model whose validity
remains to be tested. It is thus not
known to what extent the therapist’s behavior
in the experimental situation coincides
with his behavior vis-à-vis a patient.

2. The experimental statements were brief 
and out of context; there was a minimum 
of background information on each patient; 
and, of course, the all-important therapist- 
patient relationship was lacking. 

3. The three specific hypotheses were tested 
only in an approximate manner, and the tests 
are altogether relative since the reference 
distributions were composed of heterogeneous 
elements whose representativeness is not 
known. The number of suicide, transference, 
and schizoid items, moreover, was small so 
that reliable differences were difficult to dem- 
onstrate. 

Within these limitations, the present in- 
vestigation provides objective evidence on a 
problem about which much has been written 
but on which no quantitative data have yet 
been adduced. 

The finding that analyzed therapists tend 
to be more active than their unanalyzed col- 
leagues is a provocative one, but it runs 
counter to an intuitive hunch to the effect 
that the experience of personal analysis en- 
ables the therapist to maintain distance from 
the patient’s maneuvers, and consequently 
that he feels less compelled to respond to 
any and all of the patient’s communications. 
Rather, the present results may signify that 
the analyzed therapist is more skilled or more 
willing to formulate an immediate response. 
An alternative interpretation might be that 
the unanalyzed therapist is more passive when 
confronted with a series of patient statements 
in an experimental setting, but that this trend 
would be reversed in actual therapy. He 
might be more wary, more hesitant, and 
more unsure of himself and of his thera- 
peutic procedures when called upon to give 
an account of himself, so to speak, to an out- 
sider. It must be remembered here that length 
of experience per se does not account for the 
difference which, by this evidence, seems to 
be due to personal analysis or some related 
variable. 

With respect to suicide threats, it is seen 
that contrary to recommendations made in 
the literature, all therapists tend to emphasize
reassuring responses. At the same time,
we may observe a consistent but statisti- 
cally not significant increase in structuring 
responses which attempt a clarification of the 
patient’s and the therapist’s roles in the treat- 
ment situation. Such responses emphasize the 
reality aspects in opposition to the parataxic 
character of the patient’s perceptions. The 
latter finding, however, must be regarded as 
very tentative. Also, it should not be for- 
gotten that exploratory responses, while evi- 
dently not subject to shift in this instance, 
account for about a third of all therapist re- 
sponses. 

In accordance with psychoanalytic theory, 
therapists appear to single out transference 
phenomena for interpretive attention, al- 
though the experience of personal analysis 
seems to have little effect on the therapist’s 
verbal behavior in this case. As has been 
shown earlier (6), a predilection for inter- 
pretations is a distinguishing feature of all 
psychiatrists. The data would indicate that 
transference reactions act as a signal for the 
therapist to interpret, to be silent, or to de- 
fine the therapeutic situation. Also, he seems 
to be less inclined to ask the patient for an 
elaboration of his feelings. However, since 
the findings for the smaller unanalyzed group 
lack conclusiveness, it would be less than 
cautious to attribute the observed differences 
to the variable of personal analysis. 

The data presented for the productions of 
a very anxious and near-psychotic patient 
point to differential handling by analyzed 
and unanalyzed therapists, and tend to con- 
firm the third hypothesis. Silent responses ap- 
pear to be sensitive to this kind of stimulus 
material such that analyzed therapists be- 
come more active (the evidence is not con- 
clusive for unanalyzed practitioners). The 
decline in exploratory responses by unana- 
lyzed therapists may be the complement of 
this trend. In the context of the present re- 
sults, this may perhaps be regarded as tenta- 
tive evidence for the analyzed therapist’s 
greater sensitivity to the demands of the 
therapeutic situation; and if one meaning 
underlying silent responses is the attitude of 
“playing it safe,” there may be here an indi- 
cation of the analyzed therapist’s greater 
versatility and readiness for verbal participation.
However, this extrapolation is clearly a
speculation. 

In conclusion, this study has presented in- 
complete but certainly provocative evidence 
concerning differences in the therapeutic be- 
havior of analyzed and unanalyzed psycho- 
therapists. Contrary to prediction, personal 
analysis seems to lead to greater rather than 
to diminished activity on the therapist’s part. 
The implications as well as the generaliz- 
ability of these findings remain to be ex- 
plored further, but there can be no doubt that 
the problem is one of the first magnitude for
theory, for practice, and for training.


Summary 


This is the third and concluding article in 
a series of investigations to elucidate the psy- 
chotherapist’s contribution to the treatment 
situation. It studies the effect of the thera- 
pist’s personal analysis with reference to a 
series of patient communications and to (a) 
suicide threats, (b) transference reactions,
and (c) schizoid productions. Therapeutic re- 
sponses were secured from 25 psychiatrists, 
7 psychologists, and 9 psychiatric social work- 
ers of varying degrees of professional experi- 
ence by presenting a series of 27 patient 
statements extracted from actual therapeutic 
interviews. A total of 30 therapists had un- 
dergone personal analysis as part of their 
training. Therapists’ responses numbering 
1,609 were categorized by Bales’ system of 
interaction process analysis. Average rater 
agreement was 78 per cent. The major re- 
sults may be summarized as follows: 

1. Compared with unanalyzed therapists,
analyzed practitioners tend to be more ac- 
tive, as evidenced by a significantly smaller 
number of silent responses. 

2. Suicide threats evoke an increased num- 
ber of reassuring responses from both thera- 
pist groups. 

3. In dealing with transference phenomena, 
analyzed therapists tend to prefer interpreta- 
tions, silence, and structuring responses. The 
results for the unanalyzed group are incon- 
clusive. 

4. Schizoid productions of a seriously dis- 
turbed patient appear to induce a smaller 
number of silent responses in analyzed therapists
and a smaller number of exploratory responses
in unanalyzed therapists. Since the
foregoing results are significant only for one 
group, they must be considered tentative. 

Within the limitations of this study, per- 
sonal analysis has a demonstrable effect on 
the therapist’s verbal behavior. It was shown 
that this effect is independent of the thera- 
pist’s level of experience. 

This series of investigations has indicated 
the feasibility of studying objectively some 
aspects of the psychotherapist’s techniques. 
There is reason to believe that this focus on 
the therapist's contribution to the treatment 
situation is of theoretical and practical value 
for advancing our knowledge of the process 
of psychotherapy. 


Received October 8, 1954. 
Changes in the Self Concept Without Psychotherapy¹


Donald M. Taylor 


Vanderbilt University²


Psychologists approaching the problem of 
evaluation of psychotherapy have reported 
that a number of alterations in the self con- 
cept seem to accompany “improvement” as 
rated by the therapist. Studies by Raimy (5, 
6), Rogers (7, 8, 9), Hartley (3), Dymond 
(2), and others indicate that the following 
changes are characteristic: 

1. The self concept becomes more positive. 

2. It becomes more congruent with the 
self-ideal. 

3. It becomes more self-consistent. 

These, then, might be considered possible 
criteria for success in counseling. One diffi- 
culty has been the lack of information on self- 
concept changes which take place without 
therapy. Dymond (2) has summarized data 
on one control group of 15 noncounseled sub- 
jects at the University of Chicago Counseling 
Center which indicate no significant change 
in positive-negative adjustment score over a 
period of several months. 


Problem 


To obtain data on changes in the self con- 
cept for persons not receiving counseling, a 
self-concept scale was devised and utilized in 
replicated Q-sort descriptions of self by 147 
college students and 21 adults in evening col- 
lege extension classes. 

Because the process of repeated introspection
on the self might be expected to lead to
some of the self-insights anticipated in therapy,
it was hypothesized that self-concept alterations
similar to those reported in successful
counseling would be revealed in repeated
self-descriptions without psychotherapy. The
Q-sort methodology was employed to permit
comparison with results of recent research
projects dealing with changes accompanying
therapy which have made use of this technique
(2, 3, 4, 7, 8, 9, 10).

1 Based upon a dissertation submitted in partial
fulfillment of the requirements for the Ph.D. degree,
Vanderbilt University. The writer wishes to express
his gratitude to Dr. George E. Copple for his advice
and guidance.

A summary of this article was presented at the
1954 meeting of the Southern Society for Philosophy
and Psychology.

2 The writer is now Director of Psychological Services
for the Tennessee Department of Mental Health.

The self concept was defined as the indi- 
vidual’s phenomenologically unitary constel- 
lation of beliefs about and attitudes toward 
himself, the organization of his self-reflexive 
affective-cognitive structures,³ as reflected
operationally in his description of himself.

Method 

Items for the Q-sort self-concept scale were 
derived from 200 anonymous self-descriptions 
by university students and urban adults. Self- 
statements were classed as positive or nega- 
tive on the basis of ratings by a panel of 
eight judges. A preliminary set of 180 items 
was reduced to the final 120, 60 positive and 
60 negative, by item analysis of a pilot study
with 26 subjects.

Subjects were asked to describe themselves
in terms of these items by distributing the
self-statements into eleven categories, using
a platykurtic seminormalized distribution,
shown in Table 1. The same items and the
same distribution were employed for description
of the self-ideal, defined operationally as
the subject’s description of the self he would
like to be.

This design makes it possible to correlate,
item by item, the pattern of self-description
for any single individual with his own replications
and with the description of the self-ideal.
Furthermore, since a score in the range
between 1 and 11 points is assigned each
item by placing it in one of these 11 categories,
a positive self-concept score may be
obtained from each sorting by summing the
scores assigned the 60 positive items.

3 Definition formulated by the Vanderbilt-Peabody
self-concept research group, directed by Dr. Ted
Landsman.

Table 1

Card Distribution in Q-Sort Procedure

Interpretation     Category     No. of cards
Most like me          11             6
                      10             9
                      ...           ...
Least like me          1            ...

In a test-retest reliability study, 120 col- 
lege students described themselves twice, with 
the sortings separated by an interval of one 
week. Mean of the product-moment correla- 
tions of individual items for the two sortings 
for the 120 subjects was .79 ± .02. The reliability
coefficient for the positive self-concept
scores obtained by summing points assigned
the 60 positive items was .95 ± .03.
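
The scoring and the item-by-item correlation described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names, the six-item toy sortings, and the choice of which items count as positive are hypothetical and are not taken from the study, which used 120 items and 11 categories.

```python
from math import sqrt


def positive_score(sort, positive_items):
    """Sum the category values (1-11) assigned to the positive items."""
    return sum(sort[item] for item in positive_items)


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant list")
    return sxy / sqrt(sxx * syy)


def sort_correlation(sort_a, sort_b):
    """Correlate two sortings item by item (same items in both)."""
    items = sorted(sort_a)
    return pearson_r([sort_a[i] for i in items], [sort_b[i] for i in items])


# Hypothetical toy data: item label -> category placement (1 = least, 11 = most like me).
first_sort = {"A": 11, "B": 9, "C": 6, "D": 6, "E": 3, "F": 1}
second_sort = {"A": 10, "B": 9, "C": 7, "D": 5, "E": 3, "F": 2}
positive_items = {"A", "B", "C"}  # hypothetical positive statements

print(positive_score(first_sort, positive_items))           # 26
print(round(sort_correlation(first_sort, second_sort), 3))  # 0.975
```

In the study itself the same two quantities would be computed over all 120 items for each subject, and the individual correlations then averaged across subjects.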

In testing the hypothesis that changes simi- 
lar to those reported in psychotherapy would 
be found for persons engaged in self-intro- 
spection by way of repeated self-description, 
20 undergraduates and 6 graduate students 
were used as subjects. Four of these were 
adults in evening extension classes. The seven 
subjects in the intensive self-concept group 
made ten self-sorts in five days, two per day. 
Fifteen subjects in the self and self-ideal 
group also made repeated self-descriptive 
sortings in a five-day period, with two or 
three self-ideal Q sorts interpolated. In the 
group of four used for study of changes over 
a longer period of time, three made ten self-sorts
at irregular intervals over periods varying
from one to three months in length, while
one subject completed 21 sortings in seven
different environmental contexts over seven
and one-half months.


Results and Discussion 


Positiveness of self concept. With the in- 
strument employed in this research, a positive 
self-concept score in the range 223-497 points 
may be obtained by placing the 60 positive 
items in the 11 available categories. If posi- 
tive and negative items are assigned equal 
importance in the self-description, the posi- 
tive score is 360 points with an equal total 
for the negative items. For the 120 subjects 
of the reliability group, the mean positive 
score on the first sorting was 421.25 points, 
with a range of 262 to 492, and 105 subjects 
(88%) scoring themselves positively or above 
the 360-point neutrality total. The mean for 
the second sorting a week later was 426.78, 
a mean gain of 5.63 points, statistically sig- 
nificant at the 1% level (Table 2). 
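
The limits of the positive score and the neutrality total can be checked with a short calculation. The sketch below is an assumption-laden illustration: the card counts per category are hypothetical (a symmetric distribution totalling 120 cards) and are not quoted from Table 1; the function names are likewise invented for the example. Under these assumed counts the limits come out at 223 and 497 and the neutrality total at 360, in agreement with the figures given in the text.

```python
# Hypothetical card counts for categories 1 through 11 (assumption: symmetric, 120 cards).
ASSUMED_COUNTS = [6, 9, 11, 12, 14, 16, 14, 12, 11, 9, 6]


def score_bounds(counts, n_positive):
    """Lowest and highest possible sums when n_positive items fill the available slots."""
    # One entry per card slot, in ascending order of category value.
    slots = [cat for cat, n in enumerate(counts, start=1) for _ in range(n)]
    if n_positive > len(slots):
        raise ValueError("more positive items than card slots")
    return sum(slots[:n_positive]), sum(slots[-n_positive:])


def neutral_score(counts, n_positive):
    """Expected positive score if positive and negative items are placed alike."""
    total_points = sum(cat * n for cat, n in enumerate(counts, start=1))
    return total_points * n_positive / sum(counts)


print(sum(ASSUMED_COUNTS))                   # 120
print(score_bounds(ASSUMED_COUNTS, 60))      # (223, 497)
print(neutral_score(ASSUMED_COUNTS, 60))     # 360.0
```

Other distributions could reproduce the same limits, so the agreement does not establish that these were the counts actually used.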

The gains were substantially greater for the 
subjects who made repeated self-sortings in 
the brief period of five days. The mean in- 
crease in positiveness for the seven subjects 
describing themselves twice daily for five 
consecutive days was 23 points, a change sig- 
nificant at the 2% level, even for this small 
N. For the 15 subjects interpolating self-ideal 
descriptions within the self-concept replica- 
tion series, the mean increment was 17 points, 
again significant at the 2% level, using in 
each case the one-tailed test of significance 
appropriate where the change is in the direction
predicted. For the four subjects whose
self-sorts were scattered over periods of one
to seven and one-half months, with temporal
intervals more like those found in research in
which the Q sort has been used to follow
changes taking place in therapy, the mean
gain was slightly less than three points
(Table 2).

Table 2

Increase in Positive Self-Concept Score with Repeated
Q-sort Self-Description

Group                        N     First sorting   Last sorting   Mean difference
Reliability group           120       421.25          426.78           5.63*
Intensive self-sorts          7       433.00          455.86          22.86**
Self and self-ideal sorts    15       431.00          447.93          16.93**
Scattered self-sorts          4       436.75          439.50           2.75

* Significant at 1% level.
** Significant at 2% level.

It would appear, then, that self-introspec- 
tion of the kind induced by repeated Q-sort 
self-description in a short time interval tends 
to be accompanied by increased positiveness 
of self-attitude, like that reported for therapy. 
The change would seem, however, to be of 
much smaller magnitude than that reported 
in counseling. The gain in positiveness of self- 
concept adjustment score noted by Dymond 
(2) for 25 subjects receiving psychotherapy 
was approximately 38%, as compared with 
an 11% increase in positiveness for the sub- 
jects of the intensive self-sort group in this 
research, and 8% for the self and self-ideal 
group. 

Relationship between self concept and self-ideal.
Fifteen subjects in the experimental
group made repeated self-concept sortings 
with self-ideal Q sorts interpolated within 
the series and at the conclusion. All of these 
subjects sorted the items to describe the ideal 
self at least twice, once following the first 
self-sort, and once at the conclusion of the 
series on the fifth day. 

The mean correlation between the initial 
self-description and the first self-ideal sort- 
ing was .52, which would seem to indicate a 
substantial relationship (Table 3). But the 
range for individual subjects was from .04 
to .79, reflecting wide individual differences. 
Furthermore, the mean positive score for the 
self-ideal was 493 points, exceeding the mean 
for the self concept (424) by 69 points, a 
difference significant at the 1% level. Thus 
though there seems usually to be a positive 
relationship between self concept and self- 
ideal, there is also usually a significant dif- 
ference. 

With replication of descriptions of self and
self-ideal over the five-day period, the mean
positive correlation between self-concept and
ideal self increased from .52 to .66, a gain
significant at the 1% level for the 15 noncounseled
subjects. This is the trend reported
in successful psychotherapy, as illustrated in
Rogers’ case of Mrs. Oak (9) with the positive
relationship between self and self-ideal
rising from .21 to .69, and in Hartley’s case
of Zar (3) with the correlation increasing
from .18 at the beginning to .81 at the conclusion
of counseling. The increment is much
smaller for members of the experimental
group than for the two counseling cases, however,
with the greatest gain for a subject not
receiving counseling being from .43 to .74 in
five days. The changes reported in counseling
took place, of course, over a period of months.
For one subject in the experimental group,
Rah, the only one with an initially negative
self-concept score of the type often found in
persons seeking therapy (330 points), the relationship
actually dropped from .04 on the
first pair of sortings to −.18 for the last pair,
when the positive score had fallen to 315
points.

Table 3

Change in Relationship Between Self Concept
and Self-Ideal

                                    Correlation
Group or individual           First sorts   Final sorts
Experimental group (N = 15)       .52           .66*
Rah                               .04          −.18
Mrs. Oak                          .21           .69
Zar                               .18           .81

* Difference significant at 1% level.

Thus it appears that repeated interacting 
introspection on the self and the ideal self 
tends to lead to increased positive relation- 
ships like those reported in therapy, though 
of lesser magnitude. And as in therapy, the 
description of self tends to be altered more 
than the self-ideal, though changes took place 
in both for all subjects. 

Consistency of the self concept. Brownfain 
(1) has reported a significant positive cor- 
relation between consistency of self concept, 
and adjustment as indicated by inventory 
scores and social relationships. Rogers (9) 
has noted an increase in consistency of self- 
description in the case of Mrs. Oak, not only 
through the therapy period but also in follow-up
studies, from an initial correlation of
.50 for the first two self-sorts to .70 for the
last pair.

For the experimental subjects in this research
project making ten self-sorts in five
days, the mean correlation of item scores for
the two self-descriptions on the first day was
.88. The mean correlation for the last pair of
self-concept sortings on the final day was .98.
This increase is significant at the 1% level.

For the 15 subjects making both self and
self-ideal sortings, the increment was from a 
mean of .79 for the first pair of self-sorts to 
.88 for the final pair, also significant at the 
1% level (Table 4). The change for the four 
subjects who described themselves at irregu- 
lar intervals over longer periods of time was 
from a mean of .80 to .88. Of these 26 sub- 
jects, 24 showed steady increases in self- 
consistency. 

Apparently, then, increased consistency of 
self-description and presumably of self con- 
cept is to be anticipated with replication 
of description of self, whether intensive or 
spaced over relatively long time intervals, and 
whether psychotherapy is provided or not. 


Summary and Conclusions 


To determine whether changes in the self 
concept similar to those reported in successful 
psychotherapy may take place with replica- 
tion of self-description by noncounseled sub- 
jects, repeated Q-sort descriptions of the self 
were obtained from 26 persons. In addition, 
15 of the subjects interpolated self-ideal sort- 
ings within the series of self-sorts. For all 
conditions employed, the same general trends 
were found that have been reported for successful
counseling. The self concept tends to
become more positive and more consistent,
and its relationship to the self-ideal becomes
more positive. In all cases, however, these
alterations were of smaller magnitude than
those reported in counseling.

Table 4

Increase in Consistency of Self-Description
with Replication

                                       Mean correlation
Group or individual            N     First sorts   Final sorts
Intensive self-sorts           7        .88           .98*
Self and self-ideal sorts     15        .79           .88*
Scattered self-sorts           4        .80           .88
Mrs. Oak                                .50           .70

* Difference significant at 1% level.

The data obtained suggest these conclu- 
sions: 


1. Intensive self-introspection without ther- 
apy appears to be accompanied by increased 
positiveness of attitudes toward the self, at 
least for most subjects with initially positive 
self-attitudes. The increment, however, is usu- 
ally significantly smaller than that reported in 
successful counseling cases. 

2. Repeated description of the self and 
self-ideal is usually accompanied by increased 
positive relationship between the two, even 
without psychotherapy. The change is smaller 
than that reported for persons in therapy. 

3. Repeated self-description without coun- 
seling usually is accompanied by increased 
consistency of self concept. 

4. Self-introspection by self-description
without therapy may be accompanied by 
some of the changes reported in successful 
counseling, which presumably also involves 
rather intensive introspection on self. Use of 
self-descriptive techniques, e.g., the Q sort, 
while therapy is in progress, may then actu- 
ally facilitate some of the alterations said 
to be characteristic of successful counseling. 
These techniques do not appear to produce 
changes of the magnitude reported for psy- 
chotherapy, however. 

5. Significant increases in positiveness of 
self concept, and in positive relationship be- 
tween the self and self-ideal, may be valid 
indexes of improvement wrought by therapy, 
but increased consistency of self concept is 
achieved so readily by self-description with- 
out counseling that it would seem a dubious 
criterion, especially when self-inventories or 
Q sorts are used in conjunction with therapy 
for evaluation or other purposes. 


Received September 20, 1954. 
In a previous paper, it was noted that at
least three selves could be conceptualized.
These were: “(a) The objective self, defined
in terms of his actual potentialities, capacities,
and the like, i.e., the objectively measured
self; (b) The self concept, the individual’s
understanding or evaluation of these
potentialities and capacities; (c) The social
self, the operation of the potentialities and
capacities in relation to his social environment”
(1, p. 307). This previous paper further
noted that “It may be assumed that the 
three levels would be related, but disparity 
would be possible between all three levels. 
Unfortunately . . . direct comparison of the 
self levels from our present data are [sic] 
not possible” (1, p. 307). 

The present paper concerns itself with these 
interrelationships between the three selves de- 
scribed above. Specifically, measures were ob- 
tained as estimates of these three selves and 
the interrelationships are reported and dis- 
cussed. 

Procedure 


Four sections of Naval Aviation Cadets 
with N’s of 29, 24, 20, and 22 were the sub- 
jects of this experiment. Ratings were ob- 
tained from these subjects after they had 
been members of their section for 4 weeks 
and 15 weeks. During this time, each mem- 
ber of each section had had constant and 
close contact with every other member of his 
section. They had lived, worked, slept, and 
eaten together almost daily. 


1 This study was performed as part of ONR Con- 
tract NR 154-098. 

2 Opinions or conclusions contained in this paper 
are those of the author. They are not to be con- 
strued as necessarily reflecting the view or the en- 
dorsement of the Navy Department. 


The traits on which ratings were obtained 
at these periods were defined on each rating 
scale as follows: 


Leadership: Could take over and get others to go 
along with him, could direct a group in accomplish- 
ing a goal, could act as a leader. 

Social adequacy: Ability to get along with other 
people; friendly; liked by the members of his group. 

Intelligence: A “bright guy”; original; ability to 
meet new situations effectively; catches on quickly; 
can use his head; insightful. 

Possibility of success in flight training program. 

Possibility of success as a Naval Aviator. 


For each trait a rating from one to seven 
was possible with the extremes labeled as 
“least” and “most.” The cadets were in- 
structed to “normalize” their ratings, that is, 
they were told to rate the relative standing 
of their own group rather than rating against 
a general population. As such, they were told 
that they should give about as many low rat- 
ings as high ratings. 

Each member of the section was given a 
booklet of rating scales which were numbered 
from 1 through the number in his section. 
The cadet was also given a roster of his sec- 
tion in which the names had been assigned in 
a random order to numbers 1 through N. The
cadet was told to look at the number on the 
rating form, to look at the name appearing 
by this number on the roster, and to proceed 
to rate that person on the five traits of the 
scale. He also was told that one of the num- 
bers would appear by his own name. He was 
to rate himself on these traits and to circle 
that number on his rating scale. 

At the completion of the ratings in the 15- 
week session, the Otis Quick-Scoring Mental 
Abilities Test (Form Am) was administered 
to each group. 
Results 


In this paper we have concerned ourselves 
with only one of the traits of the rating scale, 
that of intelligence. The correlations between 
the three measures of intelligence obtained at 
the 15-week session were: 


Group rating and self-rating    .43
Group rating and Otis           .49
Self-rating and Otis            .21


The group rating was the mean of all of the 
ratings given to an individual by his peers; 
the self-rating was that rating which the in- 
dividual assigned to himself; the Otis rating 
was his score obtained on the Otis test of in- 
telligence. 
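
As an illustration of how the three measures just defined might be assembled and intercorrelated, the following is a minimal sketch. The rating matrix and test scores are hypothetical, and the function names are ours; the group rating is assumed to exclude the member's own self-rating.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def group_and_self_ratings(ratings):
    """ratings[i][j] is the rating (1 to 7) that rater i gave to member j.

    Returns (group, self_): group[j] is the mean of the ratings member j
    received from the other members; self_[j] is the diagonal entry.
    """
    n = len(ratings)
    group = [
        sum(ratings[i][j] for i in range(n) if i != j) / (n - 1)
        for j in range(n)
    ]
    self_ = [ratings[j][j] for j in range(n)]
    return group, self_


# Hypothetical four-member section; the numbers are illustrative only.
ratings = [
    [5, 3, 6, 2],
    [4, 4, 7, 1],
    [6, 2, 5, 3],
    [5, 3, 6, 4],
]
otis = [52, 41, 60, 38]  # hypothetical test scores

group, self_ = group_and_self_ratings(ratings)
r_group_self = pearson_r(group, self_)
r_group_otis = pearson_r(group, otis)
r_self_otis = pearson_r(self_, otis)
```

With real data each section would contribute one such matrix, and the three coefficients would be computed over all cadets.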

In addition, it was found that the correla- 
tions between the ratings obtained at the 
4-week session and at the 15-week session 
were: 


Group rating (4 week) and group rating (15 week)    .76
Self-rating (4 week) and self-rating (15 week)      .19


An analysis of variance estimate of the reli- 
ability of the group ratings was .86 at the 
4-week session and .89 at the 15-week session. 
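
The paper does not say which analysis of variance estimate was used. One common form, sketched here as an assumption, takes a complete ratee-by-rater table and estimates the reliability of the averaged rating as (MS ratees - MS error) / MS ratees. The function name and the data are hypothetical.

```python
def anova_reliability(scores):
    """Reliability of the mean rating from a ratee-by-rater table.

    scores[r][j] is the rating given to ratee r by rater j. The estimate is
    (MS_ratees - MS_error) / MS_ratees from a two-way analysis of variance
    without replication (an assumed form; the paper does not specify one).
    """
    n, k = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(scores[r][j] for r in range(n)) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / ms_rows


# Hypothetical table: five ratees (rows) rated by three raters (columns).
example = [[5, 4, 6], [3, 3, 2], [6, 7, 6], [2, 1, 3], [4, 4, 5]]
reliability = anova_reliability(example)
```

In actual peer ratings each member is also a rater, so the table has an empty diagonal; the sketch ignores that complication.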


Discussion 


In regard to the group ratings, it is quite 
clear that the magnitude of these correlations 
is dependent primarily upon the number of 
judgments obtained rather than some such 
factor as the “objectivity” of the rater. It is 
to be noted that if only one judge was being 
used, in contrast to approximately 25 judges 
in each group, correlations of the order of 
those obtained on the self-ratings would have 
been expected. These correlations could be 
estimated by considering each judgment an 
item, and inverting the Spearman-Brown for- 
mula and estimating the reliability of a “test” 
one twenty-fifth as long as that used. The resultant
reliability would be about .11. Similarly,
one could estimate the validity of the
group ratings using one twenty-fifth of the
judges by applying the formula:


$$
\text{Estimated } r_{\text{criterion}} = \frac{r_{\text{criterion}}}{\sqrt{\dfrac{1 - r_{\text{reliability}}}{n} + r_{\text{reliability}}}}
$$

In this formula, .04 would be used for the n. 
In this case, the correlation between single 
ratings of a group member and intelligence 
would be about .18. 
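
The two estimates in this passage can be checked with a short sketch. The paper does not say which coefficient was entered as the reliability. Taking the 4-week to 15-week group-rating correlation of .76 is our assumption; it reproduces the reported figures, whereas the analysis of variance estimates of .86 and .89 do not. The function names are ours.

```python
from math import sqrt


def spearman_brown(reliability, k):
    """Reliability of a test k times as long (k < 1 shortens the test)."""
    return k * reliability / (1 + (k - 1) * reliability)


def validity_at_length(validity, reliability, n):
    """Validity expected when the test is made n times as long."""
    return validity / sqrt((1 - reliability) / n + reliability)


r_group = 0.76   # assumed reliability: 4-week vs 15-week group ratings
r_otis = 0.49    # group rating vs Otis, from the Results
n = 1 / 25       # one judge in place of about 25

single_reliability = spearman_brown(r_group, n)            # about 0.112
single_validity = validity_at_length(r_otis, r_group, n)   # about 0.188
```

The first value agrees with the reliability of about .11 given above. The second, 0.188, matches the reported .18 if that figure was truncated rather than rounded.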

Further, it is possible to surmise that the 
low reliabilities and perhaps the low validity 
of the self-ratings were related to this factor 
of “test length.” The demonstrated reliability 
of personality inventories (which involve self- 
evaluation but also involve a large number 
of items) in comparison to the presently ob- 
tained reliability of .19 would support such a 
hypothesis. Certainly, some such hypothesis 
is preferable to the assumption that the in- 
dividual’s “self concept” is as unstable as the 
data may suggest. 

With these considerations the reported correlations
lend themselves to the following interpretations.
To the extent to which the “social
self” may be conceived of as the combined
group judgments of an individual by
his peers, this self showed a substantial rela- 
tionship to an objectively measured self on 
the dimension of intelligence. Further, to the 
extent to which a self-statement may be con- 
sidered related to the “self concept” a re- 
markably low relationship exists between this 
self and an objective measure of the trait in 
question. 

We must, however, point to the limitations 
inherent in these statements. In a sense the 
measure of the social self is a statistical arti- 
fact. The quite discrete individual judgments 
of the group members are not summed and 
averaged except perhaps within each indi- 
vidual as he responds to these separate judg- 
ments of his peers. Only in voting behavior 
where opinions are summed do we have a 
case in which group judgments are actually 
summed. As for the single rating reflecting 
the “self concept” we have noted that if we 
consider this self concept to be a fairly stable 
state the method here did not show such a 
stability. Perhaps to get a true picture of the 
self concept we must obtain successive judg- 
ments or multiple judgments related to the 
trait in question (similar to the personality 
inventory). 

A still further limitation should be noted. 
We have limited ourselves to peer judgments. 
These ratings, placed in the matrix of the 
total social group of an individual composed 
of not only peers but superiors and inferiors, 
may reflect quite a different picture. It is 
quite likely that the actual “selves” of an in- 
dividual are markedly affected by the varying 
proportions of superiors, peers, and inferiors in 
his operating environment. 


Summary and Conclusions 


Four groups of Naval Aviation Cadets who 
had lived together for 15 weeks rated each 
other and themselves on a number of traits 
including intelligence. The Otis test of in- 
telligence was then administered to these 
individuals. Interrelationships between these 
measures and ratings obtained after 4 weeks 
were computed. It was found that the aver- 
aged group ratings were reliable and corre- 
lated substantially with the objective meas- 
ures of intelligence. The self-ratings appeared 
to be highly unreliable and exhibited only a 
limited relation to intelligence. It was noted 
that the reliability and possibly the validity 
of these relationships were considerably de- 
pendent upon the number of judgments ob- 
tained. That is, the group ratings were an 
average of individual ratings which were no 
more reliable than the self-rating. It was further
noted that the group ratings were a statistical
phenomenon and not an actual thing. 
Further, it was noted that our findings were 
within a peer context and the effect of pres- 
ence of superiors and inferiors on these rat- 
ings was not known. In spite of these limi- 
tations, and there are probably others, two 
empirical facts could be stated: 

1. A group of peers can fairly well “clas- 
sify” its members on intelligence. The way a 
person performs in relation to his peers re- 
sults in a reliable and valid “social self” in 
regard to intelligence if this social self can be 
defined as the summation of the group’s atti- 
tude toward the person. 

2. An individual’s single statement about 
his intelligence is likely to have little rela- 
tionship to his measured intelligence at that 
time in a group of peers. Stated otherwise, if 
a self-statement is related to the “self con- 
cept” of intelligence this self concept is likely 
to be highly unrealistic in relation to objec- 
tive measures. 


Received September 21, 1954. 
MMPI Validity Scores as a Function 
of Increasing Levels of Anxiety 
In an earlier study it was noted, “Our work 
with the Taylor (anxiety) scale has led us to 
wonder if the individuals in the low anxiety 
group (our group I) are not in fact equal in 
anxiety level to the Ss in the high anxiety 
groups but are denying (by their answers to 
the Taylor) this anxiety. That is, they might 
be equal in level of anxiety to the other 
groups but are repressing or suppressing this 
fact” (9). However, the finding that low anx- 
ious and high anxious Ss earned different 
learning scores (trials) in a maze learning 
task suggested that the two groups might 
truly be different and that denial was not 
operating in the low group. 

The present study represents an attempt to 
investigate this question further. The Taylor 
Scale of Manifest Anxiety (13) is a ques- 
tionnaire-type inventory, derived from the 
MMPI, containing 50 critical items which 
constitute the anxiety scale and 175 filler 
items. Fortunately, for our purposes, these 
175 filler items contain the three validity 
scales, L, F, and K, of the MMPI (3), which 
are described in more detail below. The aim 
of the present study was to determine the 
strength of the relationship between scores on 
these three validity scales and level of anx- 
iety as judged from the individual’s score on 
the Taylor scale. 


Procedure 


The Ss used in the first part of the study 
were 119 male medical freshmen and sopho- 
mores who had served in an earlier study 
(11). On the basis of their Taylor raw scores 
the 119 medical students were assigned to five 
groups on a scale of increasing anxiety level. 
Individuals with raw scores of 1-5 were given 
an anxiety rating of 1, while those with raw 
scores of 6-10, 11-15, etc. were given ratings 
of 2, 3, 4, and 5, respectively. The number of 
Ss in each of these five groups from low to 
high was 18, 41, 34, 15, and 11, respectively. 
The findings with this group were cross vali- 
dated the following year with a group of 31 
female occupational therapy (O.T.) juniors. 
The Taylor scale was group administered in 
both samples. 
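
The coding of raw scores into anxiety ratings can be sketched as follows. The function name is ours. The paper does not say how scores outside the tabled intervals were handled, so the clamping to the lowest and highest groups is an assumption.

```python
from collections import Counter


def anxiety_group(raw_score, width=5, n_groups=5):
    """Map a Taylor raw score to an anxiety rating from 1 to n_groups.

    Scores 1..width map to 1, width+1..2*width map to 2, and so on.
    Scores outside the table are clamped to the lowest or highest group;
    the clamping is an assumption, not something the paper states.
    """
    group = (raw_score - 1) // width + 1
    return max(1, min(n_groups, group))


# Hypothetical raw scores, for illustration only.
scores = [3, 7, 12, 18, 24, 5, 10]
counts = Counter(anxiety_group(s) for s in scores)
```

The `width` parameter is included so that other interval sizes can be tried.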


Results 


In Fig. 1 is shown the relationship for the 
medical students between the L score and 
anxiety level as measured by the Taylor 
scale. The relationship is a moderately 
negative one; the value of the 
Pearson r is -.32, a value significant at the 
.01 level of confidence. Values of eta and 
epsilon, computed from the analysis of variance 
data, yield similar results, .36 and .32, 
respectively. Subjects in group 1 earn a Lie 
score mean of approximately 5.00 while those 
in group 5 earn a mean of 3.00. Thus there is 
a moderate tendency for those Ss scoring low 
on the Taylor scale to be less candid (as inferred 
from their score on the L scale) than 
individuals in the higher anxiety ranges. This 
relationship was even more striking with the 
O.T. sample. The value of r was found to be 
-.52 with this female sample. The means 
ranged from 5.71 in group 1 to 2.40 in 
group 5. 

Fig. 1. Mean Lie-scale (L) score for each of the 
five groups ranging from low anxiety (group 1) to 
high anxiety (group 5). 

Fig. 2. Mean F-scale score for each of the five 
groups ranging from low anxiety (group 1) to high 
anxiety (group 5). 

The relationship between score on the F 
scale¹ and anxiety level for the medical students 
is shown in Fig. 2. The strength of 
this relationship, significant at the .01 level, 
is greater than that of L, as seen from the r 
value of .46 (eta of .50 and epsilon of .46). 
The means vary from 3.72 in group 1 to 
10.73 in group 5. Again this relationship 
tends to be stronger in the O.T. sample 
(r of .62 and approximately similar mean 
values). From these findings it can be concluded 
that there is a moderate tendency for 
whatever it is that is measured by the F scale 
to be correlated with the Taylor index of 
anxiety level. 


1 Two of the 64 items of the F scale were apparently 
not included in the Taylor scale but this is 
considered relatively negligible for our purposes. Also 
there is an overlap of three items between the Taylor 
and the F scale. This indicates that the obtained r 
between these two measures is slightly greater than 
it would be if this overlap did not exist. 

In Fig. 3 is shown the relationship, again 
for the medical students, between anxiety 
level and score on the K scale. The magni- 
tude of the correlation, r of — .71 (eta of .69 
and epsilon of .71), indicates a very high re- 
lationship (.01 level) between these two meas- 
ures. The mean values of K vary from ap- 
proximately 20 in group 1 to 11 in group 5S. 
The correlation is again stronger in the O.T. 
sample: r of — .84, and means ranging from 
22 to 10. This very high relationship between 
the K-scale index and score on the Taylor 
scale would indicate that one’s anxiety level 
as inferred by his Taylor score reflects his 
“test-taking attitude” as well as whatever else 
it is the Taylor scale is sampling. 
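
The eta and epsilon values reported in this section are correlation ratios computed from grouped data. The paper gives no formulas, so the sketch below assumes the usual definitions: eta squared as the between-groups share of the total sum of squares, and Kelley's epsilon squared as eta squared corrected for the number of groups. The function name and the scores are hypothetical.

```python
from math import sqrt


def eta_and_epsilon(groups):
    """Correlation ratio (eta) and Kelley's epsilon from grouped scores.

    groups is a list of lists, one list of validity-scale scores per
    anxiety group.
    """
    all_scores = [x for g in groups for x in g]
    n_total, k = len(all_scores), len(groups)
    grand = sum(all_scores) / n_total

    ss_total = sum((x - grand) ** 2 for x in all_scores)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)

    eta_sq = ss_between / ss_total
    # Kelley's correction for the number of groups (assumed definition).
    eps_sq = 1 - (n_total - 1) / (n_total - k) * (1 - eta_sq)
    return sqrt(eta_sq), sqrt(max(eps_sq, 0.0))


# Hypothetical L-scale scores for five anxiety groups, low to high.
groups = [
    [6, 5, 5, 4],
    [5, 4, 4, 3],
    [4, 4, 3, 3],
    [3, 3, 2, 4],
    [3, 2, 2, 3],
]
eta, epsilon = eta_and_epsilon(groups)
```

Under this definition, an eta of .36 with 119 Ss in five groups corresponds to an epsilon of about .31, close to the .32 reported for the L scale; the small gap is consistent with rounding of eta.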


Discussion 


The correlations obtained between the Tay- 
lor questionnaire and each of the three va- 
lidity scales raise a number of interesting 
questions. The interpretation of these corre- 
lations would be less complicated if the three 
validity indices represented “independent” 
measures and were not themselves composed 
of items from the 550-item MMPI. We are 
faced here with the problem of interpreting 
correlations between two scales (the Taylor 
and each of the three validity scales) when 
both of these scales are composed of “personality-tapping” 
questionnaire-type items 
from a single population of items. A way out 
of this dilemma is to pay less attention to 
what it is these scales are purported to measure 
and to think in terms of the operational 
procedures involved in their original development 
and standardization. 

Fig. 3. Mean K-scale score for each of the five 
groups ranging from low anxiety (group 1) to high 
anxiety (group 5). 


The Taylor scale consists of 50 items, chosen out 
of a total of 200 MMPI items, upon which 5 clinicians 
showed 80 per cent agreement that these 
were indicative of manifest anxiety as the latter was 
described by Cameron. The literature of the past 
four years indicates that scores on this scale are as- 
sociated with differences in conditioning and learn- 
ing ability as demonstrated in numerous studies (8), 
intelligence as measured by a timed test (10), and 
also that these scores can differentiate, fairly effi- 
ciently, medical from psychiatric patients (12). The 
Lie scale (L) is composed of 15 items of the follow- 
ing sort: I would rather win than lose in a game; 
at times I feel like swearing; I do not always tell 
the truth; I gossip at times; etc. The authors of 
the MMPI found empirically that a large, representative 
sample of “normals” will usually answer 
“yes” to these and the other similar items making 
up the scale. The F scale was similarly developed 
and consists of 64 items which are answered in the 
infrequent direction less than 10 per cent of the 
time by these same “normals.” A subject’s score on 
the F scale (3) has been found empirically to be a 
function of carelessness in taking the test, inability 
to comprehend the items, errors in recording on the 
answer sheet, the presence of psychiatric illness (1, 
6), and extreme individuality (in the sense that it 
picks up those few people who are highly individual 
and independent and who honestly score those items 
which are infrequently scored by “normals”). The 
K scale, a 30-item scale purported to assess defen- 
siveness or lack of it in a subject’s “test-taking atti- 
tude,” was empirically derived by studying the item 
response frequencies of certain diagnosed abnormals 
whose MMPI pattern yielded essentially normal pro- 
files. It was assumed by the MMPI authors that the 
occurrence of a normal profile in a known psychi- 
atric patient was suggestive of a defensive attitude in 
this patient’s response behavior. The response fre- 
quencies of these “normal appearing” patients were 
contrasted with those from an unselected sample of 
people in general (“true normals”). The items which 
differentiated these two groups were then scored so 
that a high K score would be found among known 
abnormals with normal curves, whereas a low score 
would be found in clinical normals having deviant 
curves. The authors then identify the trait sampled 
by the K scale by suggesting that, in this operational 
sense, it can be said that a high score is indicative 
of a defensive attitude, and a low K score suggests 
unusual frankness or self-criticalness (7). 

The authors of the MMPI consider all three of 
these scales in two ways: (a) as validating scales, 
the scores of which will help the clinician determine 
for any subject whether or not his responses were 
unduly influenced by his test-taking attitude, and 
(b) as an index of certain aspects of his personality, 
i.e., if he rejects by his test responses all those items 
that imply personal faults or personality difficulty 
then this very rejection (reflected in the validity- 
scale scores) tells us something about the kind of 
person he is (3, p. 23). 


It is in regard to the second interpretation 
of the validity scales that the results of the 
present study are pertinent. The correlations 
obtained between the Taylor scale and each 
of these three measures serve the purpose of 
bringing together two separate streams of re- 
search—studies growing out of hypotheses 
based on learning theory and clinical experi- 
ence (Taylor scale) are thus related to the 
many personality theory studies which have 
grown out of the parent instrument, the 
MMPI.² If one brings together the results of 
these two independent streams of research, it 
would appear that Ss who score low on the 
Taylor scale and who have been shown to be 
superior in learning ability (to Ss whose anx- 
iety level is higher) on some tasks, and to be 
inferior on still others, and who earn higher 
scores on the ACE test of intelligence, etc., 
are also individuals who are not as frank and 
are probably more defensive in their test-tak- 
ing behavior. Individuals who score in the 
middle range or at the high end on the Taylor 
scale can be similarly understood as progres- 
sively less defensive. Thus, in the sense of the 
second usage of the validity scales, the cor- 
relations obtained in this study enable us to 
know more about the characteristics of an in- 
dividual subject than would be possible from 
knowledge of his Taylor score or his scores 
on the validity scales alone. The implications 
for the Taylor scale, and studies based on 
it, of these findings (especially the lack of 
“frankness” in the low “anxious” groups) 
will become clear only after further research. 

2 It is of interest to recall that the Taylor scale 
represents items which clinicians judged indicative of 
manifest anxiety, while the three validity scales are 
composed of items which empirically have separated 
groups (patients from “normals,” malingerers from 
nonmalingerers, etc.) and in which clinical judgment 
played no a priori role. 


These data would suggest, however, that 
the superiority of the high anxious group 
(Taylor scale) in numerous conditioning 
studies reported to date is as much a function 
of their “lack of defensiveness” in test-taking 
attitude as it is a function of their 
anxiety level. The performance of the middle 
and low anxious groups can be similarly in- 
terpreted. In this connection it is of interest 
that Baron³ has found a “striking curvilinear 
relationship” between the MMPI K score 
and verbal learning. The high correlation 
found between K and anxiety scores would 
thus indicate that Baron’s results confirm our 
earlier finding of a significant curvilinear re- 
lationship between anxiety score and stylus 
maze learning (9). The significant correla- 
tions found in the present study raise some 
other interesting questions: Is the Taylor 
scale a measure of “anxiety,” or is it a meas- 
ure of “defensiveness,” or lack of “candid- 
ness”? Are the K and L and F scales also a 
measure of anxiety? Or is it possible that all 
four scales are tapping a common factor, a 
factor we might tentatively call “level of mal- 
adjustment.” Further research will be needed 
to answer these questions, although it is of 
interest that the Taylor scale (as well as the 
validity scales) can differentiate, reasonably 
efficiently, a psychiatric from a nonpsychi- 
atric patient sample (12). 

The findings of the present study are also 
of interest for the MMPI user. The grand 
mean values for all 119 medical student Ss of 
this study (independent of Taylor anxiety 
level) on the L, F, and K scales were 3.71, 
5.17, and 16.87, respectively. These are mean 
values which are clearly in the established 
limits considered “normal,” since raw scores 
of 7, 17, and 23, respectively, are necessary 
to earn a T score of 70 or above. The curves 
shown in Fig. 1, 2, and 3 therefore represent 
a finer analysis of validity scores (heretofore 
considered “normal”) along a dimension of 
anxiety. Thus the correlations permit a re- 
finement of the “meaning” of the validity 
scores. 

The question can be raised whether the 
demonstrated relationships between the items 
making up the validity scales and those mak- 
ing up the Taylor scale are exact ones since 
the three validity scales, imbedded in the 
225-item Taylor questionnaire, might be responded 
to differently than if the full 550-item 
MMPI were administered. While this 
question must be left open, as it was not specifically 
investigated in this study, it is felt 
that the effect on the demonstrated correlations, 
if any, would not be serious. 


3 Baron, M. R. Personal communication, October 
1953. Professor Baron also found a high r between 
K and anxiety. 


The results of several recent publications are also 
of interest in relation to our findings. Heineman (4), 
working in the Iowa laboratory where the Taylor 
scale was developed, set out specifically to develop 
a forced-choice form of the Taylor inventory in or- 
der to reduce the effects of the desire of Ss to ap- 
pear in a favorable light and so earn by their re- 
sponses lower Taylor anxiety scores. His study was 
prompted by several findings at Iowa, one of which 
was a heretofore unpublished correlation of — .74 
between the anxiety scale and the K scale of the 
MMPI. The latter is comparable to the correlations 
found in the present study. By use of his forced- 
choice technique, Heineman was able to reduce the 
correlation between Taylor score and K to approximately 
-.55 when he scored his new scale in one 
way and -.40 when he scored it by a second key. 
In his review of the Taylor scale, Mandler (8) re- 
ports two studies from the Harvard laboratory. In 
the first place, Ericksen found a correlation of .72 be- 
tween psychasthenia-hysteria (MMPI hysteria score 
minus psychasthenia score) and the Taylor, with 
high anxiety scores related with psychasthenia. In 
the second, Ericksen and Davids report a rank-order 
correlation of .92 between the MMPI psychasthenia 
scale and the Taylor score, and a rank-order r of 
— .89 with the hysteria-psychasthenia measure. These 
authors also report an interesting correlation of — .90 
between the Taylor scale and rankings on the trait 
of “optimism-pessimism.” They suggest that on the 
basis of clinical appraisal, high and low scoring Ss 
on the Taylor scale also show differential defense 
mechanisms and self-attitudes. This suggestion is 
clearly similar to the finding which prompted our 
own study. Hovey, in a clinical study using psy- 
chiatric patients, found that his 60 Ss who were 
diagnosed “anxiety reaction” earned the lowest scores 
on the K scale and the highest on the F scale (5). 
These patients thus were similar to the “normal” Ss 
in our high anxiety group, group 5, who also scored 
lowest on the K scale and highest on the F scale. 
Deese, Lazarus, and Keenan (2) report a correlation 
of .81 (reduced to .40 when overlapping items 
were excluded) between the Taylor and the MMPI 
psychasthenia scale. All of these studies suggest that 
there are many interesting relationships between 
these MMPI clinical scales and anxiety as reflected 
in the Taylor score (or diagnosed clinically, as in 
the study by Hovey). 


One further point is worth noting. The 
scores of research papers utilizing the Taylor 
scale published since this crude index of anxiety 
was introduced in 1951 are an indication 
of the eagerness with which psychological sci- 
ence awaited a reliable, and, on the face, 
valid index of anxiety. The high correlations 
(— .71 and — .84) obtained in this study be- 
tween the Taylor and K scale would indicate 
that the latter may be used as an alternate 
form of the Taylor (with appropriate con- 
sideration of the negative direction of the 
correlation) in research requiring two forms 
of the scale. These correlations between K 
and the Taylor scale are not much lower than 
the two reliability coefficients of the latter 
(.81 and .89) reported by Taylor (13). 

The trend for the all-female O.T. sample 
to show consistently higher correlations between 
the Taylor scale and each of the three 
validity scales, as compared to the all-male 
(possibly more intelligent) medical student 
sample, is difficult to interpret. Further re- 
search may offer an hypothesis to explain this 
observation. 


Summary 


The Taylor anxiety scores of 119 male 
medical students were correlated with their 
scores on the three validity scales (L, F, and 
K) of the MMPI. The obtained correlations 
were — .32, .46, and —.71, respectively. 
These values were all significant at the .01 
level of confidence. These findings were cross 
validated on a sample of 31 female occupa- 
tional therapy students. There was a con- 
sistent trend for all correlations to be higher 
in this second sample. It was suggested that 
these correlations serve the purpose of bring- 
ing together two heretofore separate streams 
of research—Taylor scale studies growing out 
of hypotheses based on learning theory and 
clinical experience, on the one hand, and per- 
sonality theory studies which have grown out 
of the parent instrument, the MMPI, on the 
other. Questions regarding the interpretation 
of what each of these scales “taps” were 
raised. 
3 Bellevue Subtest Performance¹ 
Since the Taylor scale is an objective and 
apparently valid measure of anxiety, it seemed 
that here was a way, free of the usual haz- 
ards of subjectivity and bias, for selecting 
groups on which to test some common clinical 
assumptions about the subtest pattern of the 
Wechsler-Bellevue to be expected as a result 
of anxiety. 

Wechsler does not report the characteristics 
or size of his neurotic group but, apparently 
on the basis of clinical impression, describes 
it as tending to be particularly low on the 
Digit Span and Object Assembly subtests. In 
their two volumes Rapaport, Gill, and Schafer 
reported a similar pattern based, however, on 
an Anxiety and Depression group consisting 
of only ten patients. 

The subjects for the present study were 80 
medical and psychiatric hospitalized, white, 
male patients between the ages of 18 and 37. 
The subjects were administered the Wechsler- 
Bellevue and asked individually to fill out 
the Taylor scale. Raw scores on the Taylor 
scale were coded into five anxiety groups, 
Taylor scores from 1 to 8 receiving an anx- 
iety rating of one, those from 9 to 16 a rat- 
ing of two, and so on. The number of subjects 
within each group from 1 to 5 were 7, 17, 16, 
21, and 19, respectively. F tests indicated 
that the groups were equated for age, education, 
and IQ, the mean age being 29, mean 
education 10th grade, and mean IQ 105. 


1 An extended report of this study may be obtained 
without charge from Ruth G. Matarazzo, 
Dept. of Neuropsychiatry, Washington University 
School of Medicine, St. Louis, Mo., or for a fee 
from the American Documentation Institute. To obtain 
it from the latter source, order Document No. 
4523 from ADI Auxiliary Publication Project, Photoduplication 
Service, Library of Congress, Washington 
25, D. C., remitting in advance $1.75 for microfilm 
or $2.50 for photocopies. Make checks payable 
to Chief, Photoduplication Service, Library of Congress. 

By means of the F test the performance of 
the five groups was compared on each of the 
eleven subtests of the Wechsler-Bellevue and 
no relationship was found between anxiety 
level and subtest performance. The use of a 
correlational technique (epsilon) confirmed 
this lack of relationship; no one of the eleven 
epsilon values reached significance. 

Since the group means might be obscuring 
true differential patterning effects or “scatter” 
among the various subtests for individual 
patients, two measures of individual scatter 
analysis were used: (a) range; and (b) deviation 
from the mean (a count was made of 
the number of subtests which differed by three 
or more weighted score units from the indi- 
vidual’s mean weighted subtest score). Neither 
of these two measures bore a relationship to 
anxiety level. 
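
The two scatter measures described above can be sketched as follows. The function name and the profile are hypothetical; the cutoff of three weighted score units is taken from the text.

```python
def scatter_measures(weighted_scores, cutoff=3):
    """Two scatter indices for one patient's weighted subtest scores.

    Returns (score_range, n_deviant), where n_deviant counts subtests that
    differ from the patient's own mean by `cutoff` or more units.
    """
    mean = sum(weighted_scores) / len(weighted_scores)
    score_range = max(weighted_scores) - min(weighted_scores)
    n_deviant = sum(1 for s in weighted_scores if abs(s - mean) >= cutoff)
    return score_range, n_deviant


# Hypothetical profile of eleven weighted subtest scores (mean of 10).
profile = [10, 12, 7, 11, 9, 14, 10, 6, 11, 10, 10]
score_range, n_deviant = scatter_measures(profile)  # (8, 3)
```

Each patient yields one pair of values, which can then be compared across the five anxiety groups.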

Comparing the upper and lower 20% of 
the anxiety continuum, as has been done in 
most studies with the Taylor scale, did not 
change the negative results. Eliminating sub- 
jects with IQ less than 100 also produced no 
change. 

Assuming that the Taylor scale measures 
the same thing which clinicians mean when 
they speak of anxiety, the present results must 
lead us to carefully question the validity of 
the diagnostic signs described by Wechsler, 
Rapaport, and others. The present results are 
perhaps surprising in view of the long and 
widely held belief among clinicians that digit 
span especially, and perhaps object assembly 
to a lesser extent, are vulnerable to anxiety. 


Brief Report 
Received December 21, 1954. 
The Relation of Manifest Anxiety to Association 
Productivity and Intellectual Attainment¹ 
Within Hull’s theoretical system all habit 
tendencies activated by a given stimulus are 
considered to be multiplied by the total drive 
then operating. This proposition has been 
subjected to a number of recent experimental 
tests. A popular technique has been to employ 
a personality scale of manifest anxiety (8) to 
provide a measure of total drive strength. 
Performance of Ss selected on the basis of 
high or low scores on this scale has been com- 
pared on such measures as eyelid conditioning 
(2, 5, 6, 7), verbal learning (3, 4, 9), and 
various other more complex tasks (1, 10, 11). 
In general, the results have tended to support 
predictions derived from Hullian theory. 

This theoretical formulation holds two some- 
what related predictions for performance 
measures. The higher the total drive state, 
the more likely is the occurrence of the strong- 
est response tendency, but, also, with high 
drive states, the greater is the total number 
of response tendencies that are above thresh- 
old. The studies on eyelid conditioning have 
rather directly verified the effect of drive 
level upon a single dominant response, and 
the verbal learning studies have indirectly 
confirmed the related prediction that with 
high drive a greater number of response 
tendencies would be above threshold. In these 
latter studies the support has been indirect 
since it has come from the deduction that 
high drive leads to a greater number of com- 
peting responses which, in turn, differentially 
interfere with learning depending upon the 


1This investigation was facilitated by research 
grants from the Harvard Laboratory of Social Rela- 
tions, the Rockefeller Foundation, and the National 
Institute of Mental Health (Grant M-700), Public 
Health Service. 


degree of learning (relative strength of the 
correct response) that has occurred. 

It was our first purpose in the present ex- 
periment to provide a more direct test of the 
deduction that a high drive level would lead 
to a greater number of responses being supra- 
threshold. From this deduction, it would be 
expected that scores on the manifest anxiety 
scale would be directly related to the number 
of associations given in response to stimulus 
words on a chained word association test. 
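
The deduction that a higher drive level places more responses above threshold can be illustrated with a minimal sketch. It assumes only the multiplicative rule described earlier (each habit tendency multiplied by total drive); the habit strengths, drive values, and threshold are hypothetical and carry no empirical meaning.

```python
def suprathreshold_count(habit_strengths, drive, threshold):
    """Count response tendencies whose excitatory strength exceeds threshold.

    Excitatory strength is taken as habit strength times drive, following
    the multiplicative assumption described in the text.
    """
    return sum(1 for h in habit_strengths if h * drive > threshold)


habits = [0.9, 0.6, 0.4, 0.25, 0.1]   # hypothetical habit strengths
threshold = 0.5                        # hypothetical response threshold

low_drive = suprathreshold_count(habits, drive=1.0, threshold=threshold)   # 2
high_drive = suprathreshold_count(habits, drive=2.5, threshold=threshold)  # 4
```

With the same set of habits, raising the drive multiplier brings additional weaker associations above threshold, which is the basis for expecting more associations from high-anxiety Ss.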

A second purpose in this study was to in- 
vestigate relations between scores on the anx- 
iety scale and two measures of intellectual 
ability. Although it is common practice to 
select subjects from a college population on 
the basis of their anxiety scores and to com- 
pare the learning rate of “anxious” and “non- 
anxious” groups, the question of whether this 
selection procedure introduces ability or in-
tellectual differences is as yet unanswered.²
In the attempt to secure evidence bearing
upon this question, we correlated results on 
the anxiety scale with performance on a bat- 
tery of college aptitude tests and with aca- 
demic achievement. 


Method 


Subjects. Forty male undergraduates par- 
ticipated in this study. They were selected 
from among a larger number of students who 
had volunteered to serve in an extensive series 
of psychological studies. Their selection was 
on the basis that the group include as much 
diversification as possible, within the limits
of a college population, in regard to socio-
economic status, interests, abilities, and extra-
curricular activities.


2 It should not be overlooked that an unambiguous
answer to this question may be difficult, or impos-
sible, to attain since anxiety could conceivably affect
performance on “ability” or “intelligence” tests.

Word association test. This test consisted 
of a list of 100 nouns, selected as representa- 
tive of a wide variety of attitudes, beliefs, and 
personality traits. Ten of the words were de- 
signed to relate to anxiety or fear. The words 
were presented to Ss by means of a tape re- 
corder. The Ss were supplied with numbered 
data sheets and were instructed to write down 
as many associations as possible to each of
the 100 stimulus words. The words were 
played one at a time from the tape recorder, 
and 20 seconds were allowed after each word 
for Ss’ chained associations. 

Administration of the manifest anxiety 
scale. In a session following the word associa- 
tion test, Ss were individually administered 
the anxiety scale. The items selected by Tay- 
lor (8) were interspersed among the items 
from the hysteria, psychasthenia, lie, and K 
scales of the MMPI. Scores on the anxiety 
scale for the 40 Ss in our sample ranged from 
2 to 41 with a median of 17. 

Measures of scholastic ability. Transcripts 
of Ss’ academic records were secured from the 
registrar’s office, and they were rank ordered 
in terms of their grade-point average. Let- 
ter grades, with plusses and minuses taken 
into account, were transformed to numerical 
equivalents employing a scoring scheme that 
ranged from 1 to 12, with “A” equal to 1 and 
“E” equal to 12. Grade-point averages for the 
40 Ss ranged from 1 (A) to 8 (C-), with
a median of 5 (B-), indicating considerable
diversity in academic achievement. A separate 
rank order was determined on the basis of Ss’ 
performance on the college entrance examina- 
tions. Here again pronounced variability was 
evident, with scores ranging from 474 to 730 
and a median of 611. 
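
The grade scoring described above can be sketched as follows. Only the endpoints (“A” = 1, “E” = 12) and the values 5 (B-) and 8 (C-) are given in the text; the intermediate steps of the scale and the function names are assumptions made for illustration.

```python
# Hypothetical 12-point letter-grade scale: only A = 1, B- = 5, C- = 8,
# and E = 12 are stated in the text; the other steps are assumed.
GRADE_POINTS = {
    "A": 1, "A-": 2, "B+": 3, "B": 4, "B-": 5, "C+": 6,
    "C": 7, "C-": 8, "D+": 9, "D": 10, "D-": 11, "E": 12,
}


def grade_point_average(letter_grades):
    """Mean of the numerical equivalents (lower is better on this scale)."""
    points = [GRADE_POINTS[g] for g in letter_grades]
    return sum(points) / len(points)


def rank_order(averages):
    """Rank subjects from best (rank 1) to worst; ties share the mean rank."""
    ordered = sorted(range(len(averages)), key=lambda i: averages[i])
    ranks = [0.0] * len(averages)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of subjects tied with position i.
        while j + 1 < len(ordered) and averages[ordered[j + 1]] == averages[ordered[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[ordered[k]] = mean_rank
        i = j + 1
    return ranks
```

The same ranking helper could be applied to the entrance examination scores after reversing their sign, since higher scores are better there.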


Results and Discussion 


Table 1 contains correlations between scores 
on the anxiety scale and several measures 
of performance on the word association test. 
The coefficients in the first row are based 
upon the entire series of 100 stimulus words, 
while the coefficients in the second row are 
based only upon responses to the 90 nonanx-
iety stimulus words.


Table 1

Product-Moment Correlations Between Manifest
Anxiety Scores and Word Association
Test Performance
(N = 40)

                                     Number of    Percentage
                       Number of     anxiety      of anxiety
Measure                associations  responses    responses
100 Stimulus words     .45**         [illegible]  .34*
90 Nonanxiety
  stimulus words       .40**         .42**        [illegible]

* Significant beyond the .05 level.
** Significant beyond the .01 level.


In the first column of
this table, it can be seen that scores on the
anxiety scale are directly related to produc- 
tivity of associations. Whether or not the 10 
anxiety stimulus words are included in the 
computations, high anxiety scores are cor- 
related positively with a high number of as- 
sociations. These relationships are significant 
beyond the .01 level. 

The magnitude of the differences in number 
of associations can be seen when a compari- 
son is made between the means obtained by 
the 20 highest Ss and 20 lowest Ss on the 
anxiety scale. The mean number of associa- 
tions given by the high group was 530 and 
the mean number given by the low group was 
454. Comparison of the difference between 
these means results in a t of 3.21 which is
significant beyond the .001 level for a one- 
tailed test. These findings are in keeping with 
the expectation that high drive level, as meas- 
ured by the anxiety scale, increases the num- 
ber of responses that are above threshold. 

While our major concern was whether the 
anxiety scale would relate to productivity of 
associations, the present data afford an op- 
portunity to examine the relation between 
anxiety measured by the Taylor scale and 
anxiety assessed by the word association tech- 
nique. Since the Taylor scale contains a large 
number of items concerning subjective feel- 
ings of anxiety, it seems plausible to expect 
that Ss scoring high on this inventory would 
evidence a relatively large amount of verbal 
content indicative of anxiety and anxiety 
ideation. To explore this possibility, we com-
puted the number of associations given by
each S that referred to anxiety ideation.³ The
coefficients in the second column of Table 1 
show the degree of correlation between anx- 
iety scores and number of anxiety responses 
elicited from the different Ss. Again relations 
significant beyond the .01 level are obtained 
whether or not the 10 anxiety stimulus words 
are included in the computations. The Ss with 
high manifest anxiety scores tend to give more 
anxiety associations than do the low-scoring 
Ss. 

This latter finding holds true even when 
the number of anxiety responses is corrected 
for differences in total productivity. In column 
3 of Table 1 the number of anxiety responses 
has been expressed as a percentage of the in- 
dividual’s total number of responses, and the 
obtained correlations remain above the .05 
level of statistical significance. As an addi- 
tional control for the influence of total pro- 
ductivity on the relation between the manifest 
anxiety scale and anxiety associations, partial 
correlations were computed between these two 
measures of anxiety, with the total number of 
associations held constant. In response to the 
entire word association test and the 90 non- 
anxiety stimulus words, the partial correla- 
tion coefficients are significant at the .05 level 
(.36 and .31), providing clear-cut evidence 
of a positive relation between these two di- 
verse approaches to the measurement of 
anxiety. 
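
The partial correlations mentioned above can be illustrated with the standard first-order formula. This is a minimal sketch: the input coefficients in the example are hypothetical, not the study's values, and the function name is an assumption.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical inputs, not the study's values:
# x = anxiety-scale score, y = number of anxiety associations,
# z = total number of associations (the variable held constant).
example = partial_correlation(r_xy=0.50, r_xz=0.40, r_yz=0.60)
```

When z is unrelated to both x and y, the partial correlation reduces to the ordinary correlation between x and y.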

The second purpose in this study was to 
determine whether the relations that have 
been reported between the Taylor anxiety 
scale and various performance measures might 
not have been due in part to a correlation of 
this scale with intelligence. With measures of 
scholastic aptitude and achievement as indices 
of intellectual capacity, college entrance ex- 
amination scores and grade-point averages 
were correlated with scores on the Taylor 
scale. It was found that anxiety scores cor-
related -.02 with entrance examinations and
.17 with grade-point averages. Thus, in 
neither case did the coefficient approach sta- 
tistical significance. These two measures of 
intellectual ability were also correlated with 
word association test productivity and, again,
there was no evidence of significant relation-
ships.


3 The percentage of agreement between two inde-
pendent raters’ scoring of anxiety responses for 20
Ss was 91.

This failure to obtain relationships even ap- 
proaching significance between results on the 
anxiety scale and measures of intellectual 
ability strongly suggests that scores on the 
scale are largely independent of intelligence. 
Of course, it must be realized that these cor- 
relations were obtained on the basis of re- 
sults secured from an intellectually homo- 
geneous population. And it should not be 
overlooked that if the sample contained a 
representative proportion of the general popu-
lation a significant relation between anxiety
score and intelligence might be demonstrated. 
But, in this regard, it is important to note 
that most of the findings reported with the 
anxiety scale have been obtained with col- 
lege students similar to those employed in the 
present study. Therefore, it would seem safe 
to conclude that the demonstrated differences 
in performance between Ss who score high 
and low on this scale of manifest anxiety are 
probably not attributable to differences in 
intellectual ability. 


Summary 

In the present study, scores on the Taylor 
scale of manifest anxiety were correlated with 
performance on a 100-word chained associa- 
tion test. Supporting the prediction based on 
the supposition that anxiety measures drive, 
significant positive correlations were found 
between anxiety scores and productivity of as- 
sociations. It was also found that Ss scoring 
high on the anxiety inventory tended to give 
relatively more associations containing anx- 
iety ideation than did low-scoring Ss. Cor- 
relation of anxiety scores and association pro- 
ductivity with grade-point averages and per- 
formance on college entrance examinations 
indicated that, in the present sample, both 
anxiety and productivity were independent of 
these measures of intelligence. 
Received March 21, 1955. 
Early Publication.


A Failure to Replicate the Finding of a Negative
Correlation Between Manifest Anxiety 
and ACE Scores 


R. E. Schulz and Allen D. Calvin 


Michigan State College 


In a recent study which appeared in this 
journal Matarazzo, Ulett, Guze, and Saslow 
report (1, p. 203), “Therefore it can be con- 
cluded that anxiety appears to be moderately, 
although significantly, correlated with total 
ACE score and that the higher the anxiety 
level the lower, on the average, is the ob- 
tained score on this test.” They obtained a 
Pearson r of -.25 which with an N of 101
was significant at the 1% level of confidence. 
This finding of course is of considerable im- 
portance because of the large number of 
studies which have recently appeared in the 
literature in which the Taylor Manifest Anx- 
iety Scale (A scale) was used to separate 
anxious from nonanxious Ss. If the A scale is 
related to the ACE then the differences on 
learning problems, perceptual tasks, etc. ob- 
tained between anxious and nonanxious Ss 
chosen on the basis of the A scale may be 
merely due to intelligence¹ rather than anx-
iety. With these facts in mind we decided
that an attempt should be made to substanti- 
ate their findings. 
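
As a rough illustration of how a Pearson r is tested against zero, the coefficient can be converted to a t statistic with N - 2 degrees of freedom. The sketch below uses the r and N quoted above; the function name is an assumption, and the critical value to compare against depends on whether a one- or two-tailed test is intended.

```python
import math


def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Values reported by Matarazzo et al. as quoted in the text: r = -.25, N = 101.
t_value = t_from_r(-0.25, 101)
degrees_of_freedom = 101 - 2
```

For these inputs t is about -2.57 with 99 degrees of freedom, which would then be looked up in a t table.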

Our sample originally consisted of 99 
Michigan State College students, but one 
student was not included in our analysis be-
cause of his extremely low ACE² score of 27.


1 Matarazzo et al. failed to find a significant re- 
lationship between the Taylor and the CVS. They 
attribute this lack of relationship to the fact that 
the CVS is not a timed intelligence test while the 
ACE is. 

2 The raw ACE score is not given to a student at 
Michigan State College; instead, a rank is assigned. 
Therefore, in making our analysis we used the mid- 
point of the raw scores of each rank except for the 
highest rank where, of course, only the lower limit 
was used. 


Following the procedure utilized by Mata- 
razzo et al., we divided our Ss into four 
groups on the basis of their A-scale scores. 
Group I consisted of those Ss with A-scale 
scores from 1-8; Group II, 9-16; Group III, 
17-23, and Group IV, those Ss with A-scale 
scores of 24 and higher. A Pearson r was 
computed, and a correlation of .02 was ob- 
tained. This practically zero relationship fails 
to confirm the findings of Matarazzo et al. 
The present authors felt that a more precise 
estimate might be obtained if the actual A- 
scale scores for each S were utilized instead 
of grouping them into four categories as 
Matarazzo et al. did. We therefore computed 
an r using individual A-scale scores, but 
again we obtained a zero correlation. 

In order to test for rectilinearity an eta was 
computed, and a value of .10 was obtained 
which is also not significantly different from 
zero. A χ² test of goodness of fit was made,
and the resulting χ² fell far short of signifi-
cance indicating a rectilinear relationship.³
This supports the findings of Matarazzo 
et al. who also report a rectilinear relation- 
ship. 

Why did we find a low positive relation- 
ship which does not differ significantly from 
zero while Matarazzo et al. report that they 
found a significant negative relationship? One 
possibility is that our Ss had a different dis- 
tribution of Taylor scores than theirs. They 
give the following distribution of anxiety 
scores: Group I had 13 Ss, Group II, 47 Ss, 
Group III, 23 Ss, and Group IV, 18 Ss. Our
Ss scored as follows: Group I, 24 Ss, Group
II, 44 Ss, Group III, 22 Ss, and Group IV,
8 Ss. A χ² was computed, and since it fell
short of significance (p > .05) the null hy-
pothesis tends to be confirmed. Another pos-
sibility is that the difference between their
findings and ours could be attributed to
variations in selection procedures which might
result in a different type of student body at
the respective institutions. However, until
more evidence appears to support the results
of Matarazzo et al. it would seem that a
valid relationship between intelligence and
scores on the Taylor Manifest Anxiety Scale
has yet to be established.


3 Since the r and eta we obtained are both insig-
nificant, our χ² test, of course, is not too meaningful.
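
The comparison of the two distributions of anxiety scores can be sketched as a chi-square test on a 2 x 4 table of counts. The text does not say exactly how the authors set up their χ², so treating it as a test of homogeneity between the two samples is an assumption, as are the function and variable names.

```python
def chi_square_homogeneity(rows):
    """Pearson chi-square for an r x c table of counts; returns (chi2, df)."""
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return chi2, df


# Group I-IV counts as given in the text.
matarazzo = [13, 47, 23, 18]      # N = 101
present_sample = [24, 44, 22, 8]  # N = 98
chi2, df = chi_square_homogeneity([matarazzo, present_sample])

CRITICAL_05_DF3 = 7.815  # tabled chi-square value for p = .05, 3 df
significant = chi2 > CRITICAL_05_DF3
```

Under this setup the statistic comes to roughly 7.2 on 3 degrees of freedom, below the tabled .05 value, which agrees with the reported p > .05.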
Influence of Socioeconomic Status on Wechsler
Intelligence Scale for Children: Addendum 


Betsy Worth Estes


The present study is a continuation of “In-
fluence of Socioeconomic Status on Wechsler
Intelligence Scale for Children: An Explora- 
tory Study” (1). 

In the previous study there were two groups 
of subjects: one aged 7 years, plus or minus 
2 months, in the second grade (N = 40); the
other aged 10 years, plus or minus 2 months, 
in the fifth grade (N = 40). Each group was 
further divided by socioeconomic status into 
two groups, one of upper economic and one 
of lower economic status. Each group had an 
equal number of boys and girls. Mean scores 
on the WISC did not vary with socioeconomic 
level in the fifth grade but did vary in the 
second grade. In the second graders, the upper 
group exceeded the scores of the lower group 
at the 1 per cent level. Several possible ex- 
planations for this decrease in the effects of 
socioeconomic status with increase in age and 
grade were offered. 

In 1953-1954, the previous second graders 
were in the fifth grade; therefore, it was de- 
cided to retest them. 


Subjects and Procedure 


Some of the subjects were unavailable as 
they had moved out of the state. In the 1950- 
1951 study there were 20 in the upper group 
and 20 in the lower. In the 1953-1954 study 
there were 18 in the upper and 14 in the 
lower. These latter subjects who were avail- 
able for the present study had shown a sig- 
nificant difference in the previous study be- 
tween the upper and lower groups. The WISC 
was administered to each child. 


Results 


There was an increase in mean IQ score for
each economic group, but the difference be-
tween the groups was not statistically signifi-
cant, as shown in Table 1. The scaled scores
of the WISC also showed no significant dif-
ference in the 1953-1954 study.


Table 1

WISC Intelligence Quotients of Two
Socioeconomic Groups

Socio-              1950-1951             1953-1954
economic
group          Mean      SD       t       Mean     [remainder illegible]
Upper         114.50    9.49             119.44
                                3.02**
Lower         104.21    9.05             [illegible]

** Significant at less than [level illegible].
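
The t reported for the 1950-1951 comparison can be approximated from the tabled means and standard deviations. This is a sketch under stated assumptions: the group sizes of 18 and 14 are taken from the Subjects section, a pooled-variance t is assumed, and the text does not say whether the SDs were computed with N or N - 1 in the denominator, so both conventions are shown. The function name is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2, sd_uses_n=True):
    """Independent-groups t from summary statistics with pooled variance.

    sd_uses_n=True treats each SD as computed with N in the denominator;
    False treats it as computed with N - 1.
    """
    if sd_uses_n:
        ss1, ss2 = n1 * sd1 ** 2, n2 * sd2 ** 2
    else:
        ss1, ss2 = (n1 - 1) * sd1 ** 2, (n2 - 1) * sd2 ** 2
    pooled_variance = (ss1 + ss2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error


# 1950-1951 means and SDs from Table 1; group sizes of 18 and 14 from the text.
t_n = pooled_t(114.50, 9.49, 18, 104.21, 9.05, 14, sd_uses_n=True)
t_n_minus_1 = pooled_t(114.50, 9.49, 18, 104.21, 9.05, 14, sd_uses_n=False)
```

The two conventions give values of about 3.01 and 3.10, both close to the tabled 3.02.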


Discussion 
The same result was obtained in the later
study as in the first one: a decrease in the
effects of socioeconomic status with increase
in age and grade. The evidence is strength-
ened by the demonstration that the same chil-
dren who differed in the second grade did not
differ in the fifth grade.

It must not be overlooked that the means
of the upper group on both IQ’s and scaled
scores are higher than the means of the lower
group. However, by the fifth grade, they did
not differ significantly for these subjects.
The possible explanations offered in the first
study may still be held: selective factors in
promotion, the public schools, movies, Sunday
schools, etc. exerting an increasing effect with
advancing age and thereby narrowing the dif-
ferences between socioeconomic levels, or fac-
tors within the WISC itself which operate to
reduce socioeconomic effects with age.


Summary 


Two groups of subjects, 18 in the upper 
socioeconomic group, 14 in the lower group, 
were retested on the WISC. The significant 
difference between the two groups which was 
found three years earlier when they were in 
the second grade no longer existed. With in-
crease in age and grade, there was a decrease
in difference between groups as measured by 
mean IQ and mean scaled scores on the 
WISC.


This study is an outgrowth of research on
the development and selection of measures 
for a battery designed to establish a psycho- 
logical prognosis for schizophrenic patients. 
A preliminary report by Zaslow (2), in which 
he described a sorting test used to investi- 
gate disturbances in concept formation in a 
group of schizophrenic subjects, was felt to 
be sufficiently encouraging to warrant fur- 
ther investigation. Although a recent article 
by Stacey and Cantor (1) supplied data on 
subnormal adolescents, to date no study has 
reported on the reliability of the procedure. 
To this end, an attempt was made in the 
present study to evaluate the test’s reliability 
with two populations, as well as to investigate 
its validity by repeating the test on three 
sample groups. 


Method 


The test consists of 14 geometric designs, 
each drawn on a white card 2 inches square. 
The first design is an equilateral triangle with 
sides of 114 inches; the design on the last 
card is a true circle with a diameter of 154 
inches. The 12 intermediate cards form a 
continuum in which the sides of the triangle
become increasingly circular and finally ter-
minate in the true circle of card 14.


1 This research was facilitated by a grant from
the National Institute of Mental Health, Public
Health Service, Department of Health, Education,
and Welfare, Project M 586C. The authors are in-
debted to Professor Joseph Zubin, Principal In-
vestigator, for his invaluable advice and guidance.
We wish to thank Ethel Schmid and Herbert Green-
wald for their kind assistance with the testing.

2 Member of the Project staff at the time of this
investigation.


The test was administered individually to three 
groups. The first group consisted of 45 patients in 
the early stages of schizophrenia, 26 females and 19 
males, randomly selected from the admission and 
convalescent wards of the New York State Psychi- 
atric Institute. None of these patients had been in- 
stitutionalized for more than 20 months at the time 
of testing. The mean period of hospitalization was 
4.6 months. Ages ranged from 14 to 55 years, with 
a mean of 25.8 years. The educational level ranged 
from 8 to 20 years with a mean value of 12.5 years. 
All patients included in this group were diagnosed 
as schizophrenics of various types, though the ma- 
jority fell into the paranoid classification. 

The second experimental group was obtained from 
the reception wards of Brooklyn State Hospital. 
They were chosen at random by the psychiatric 
staff from the recently admitted patients who had 
a diagnosis of schizophrenia. Their ages ranged from 
16 to 51 years, with a mean of 31.1 years. Their 
educational level ranged from 6 to 16 years with a 
mean of 10.8 years. The 23 patients in this group 
comprised 13 females and 10 males. 


These two patient populations may be con- 
trasted in the following respects: the Psychi- 
atric Institute patients were all voluntary ad- 
missions, capable of adapting to a permissive 
hospital milieu, of adequately caring for 
themselves, and in general did not pose a 
nursing problem. They were admitted to the 
hospital largely on the basis of their having 
a favorable prognosis. The sample drawn 
from Brooklyn State Hospital is more repre- 
sentative of the admissions usually found in 
a large state hospital. These patients mani- 
fest overtly bizarre behavior and many of 
them pose serious management problems. At
least one-third of this group has a history of
one or more previous hospitalizations for a 
mental disorder. 


A group of 23 college students, 8 females and 15 
males, attending summer session courses at a large 
nearby university were employed as controls. These 
subjects were regarded as normal on the basis of 
their having no clinically manifest neurosis or psy-
chosis. None were under medical or psychological
treatment at the time of testing, nor had they ever
been hospitalized for a mental disorder.


A test procedure consisting of the follow- 
ing four major parts was employed: 


Part A. The 14 cards are scattered on a table be-
fore the subject, who is then requested to arrange
them in one row in the best possible manner.

Part B. The true triangle and the true circle are
placed at some distance from each other and the
subject is requested to arrange the remaining cards
(which are scattered randomly as in Part A) be-
tween these two in proper order so as to form a
continuum from triangularity to circularity.

Part C. The examiner arranges the cards correctly
on the continuum and the subject is requested to
indicate how many cards belong in the triangle
group and how many cards belong in the circle
group. This establishes conceptual boundaries of a
certain magnitude for these two concepts.

Part D. The subject is requested to remove all
cards which are neither triangles nor circles. This
measures the ability to maintain the previously es-
tablished conceptual boundaries.


Results 


In the absence of an alternate form of the 
test, and the inapplicability of the split-half 
method, the test-retest technique was em- 
ployed to determine the dependability of the 
instrument over various intervening periods 
of time. The performances of all subjects in- 
cluded in the study were rated on the follow- 
ing parts of the test. 


Part A. There are three possible levels of perform- 
ance for which scores have been determined: (a) 
the arranging of the randomly placed cards along a 
continuum extending from triangularity to circu- 
larity or vice versa for a performance on the highest 
level; (b) grouping the cards in accordance with
the three basic concepts of triangle, circle, and mid- 
dle area, for an intermediate level of conceptualiza- 
tion; (c) arranging the cards as simple pairs or pat- 
terns, or with no consistency at all, for a primitive 
level. This portion of the test was not included in 
the computation of the reliability coefficients be- 
cause of its gross susceptibility to practice effect. 

Part B. A perfect score is obtained when each
card is placed in its proper sequence along the con-
tinuum. A quantitative performance score is ob-
tained by calculating the discrepancy between the
expected order of the cards and the obtained order 
and squaring and summing these discrepancies. 

Part C. This yields three scores based on the 
point at which the subject establishes his conceptual 
boundaries: Ct, the card with which the subject
establishes the triangle boundary; Cc, the card at
which the boundary of the circle was placed; and Cm,
the number of cards remaining in the intermediate range,
being classified as neither triangles nor circles.

Part D. This gives a measure of the instability and
interpenetration of boundaries established in Part C,
by determining whether the triangle and circle boundaries
are maintained when the subject is requested to re-
move those cards which he considers to be neither
triangles nor circles. This score is recorded as a
“yes” or “no” entry.
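
The Part B performance score described above can be sketched as a sum of squared position discrepancies. The text does not define "discrepancy" further, so taking it as the difference between each card's obtained and expected position is an assumption, as are the card numbering (2 through 13 for the 12 intermediate cards) and the function name.

```python
def part_b_score(obtained_order, expected_order=None):
    """Sum of squared position discrepancies between obtained and expected order.

    Cards are identified by their number on the continuum; by assumption the
    12 intermediate cards are numbered 2 through 13.
    """
    if expected_order is None:
        expected_order = sorted(obtained_order)
    expected_position = {card: i for i, card in enumerate(expected_order)}
    return sum(
        (position - expected_position[card]) ** 2
        for position, card in enumerate(obtained_order)
    )


# A perfect arrangement and one with a single adjacent pair reversed.
perfect = part_b_score(list(range(2, 14)))
one_swap = part_b_score([2, 3, 5, 4, 6, 7, 8, 9, 10, 11, 12, 13])
```

On this scoring a perfect arrangement yields 0 and larger values indicate greater departure from the continuum.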

The reliability coefficients for the various 
portions of the test and for the two groups 
which were retested are shown in Table 1. 

Table 1

Test-Retest Correlation Coefficients for
Zaslow Test Scores

                         Psychiatric
                         Institute
            Normals      patients
Score       (N = 16)     (N = 21)
B            .14          .46
Ct           .43          .14
Cc           .90**        .33
Cm           .46*         .24
D            .42          .62†

* Significant at .05 level.
** Significant at .01 level.
† The correlation of .62 for Part D was signif-
icant at the .01 level if the test of significance
[...]. When tested by a nonparametric χ², which
does not assume a normal distribution, the relationship is sig-
nificant at about the .15 level.

The mean retest interval for the normal 
group was 4.4 days, and for the Psychiatric 
Institute group it was 14.3 days. 

The performance of the three sample 
populations is summarized in Table 2. A chi- 
square analysis was undertaken to determine 
whether any of the obtained scores were sig- 
nificantly differentiating. Only the first, or 
Part A score (performance employing high- 
level conceptualization) was found to be sta- 
tistically significant in that a significantly 
higher percentage of the Normal subjects 
than of the Brooklyn State group employed 
high-level conceptualization in Part A. Also, 
a significantly higher percentage of the Psy-
chiatric Institute group than of the Brooklyn
State patients employed a high-level concep-
tualization in Part A. The difference between
the Normal and Psychiatric Institute group
is not statistically significant on this score.
No other intergroup comparison on any of
the five remaining scores yielded a statisti-
cally significant difference.


Table 2

A Comparison of the Normal, Psychiatric Institute,
and Brooklyn State Patient Groups on
the Various Test Scores

                                             Psychiatric
                                   Normals   Institute   Brooklyn
Item                               (N = 23)  (N = 45)    (N = 23)
Percentage of group employing
  high-level conceptualization       61        67          17*
Percentage of group placing
  triangle between cards 3 and 5     13        38          30
Percentage of group scoring 25
  or higher in Part B                 0         9          13
Percentage of group placing
  circle at cards 11 and 12          26        42          45
Percentage of group including
  “middle” area of four to
  eight cards                        26        20          30
Percentage of group maintaining
  boundaries in Part D               35        29          17

* Differs at the .01 level from Normals and from Psychiatric
Institute group.
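
The Part A comparisons can be illustrated with a 2 x 2 chi-square. This is a sketch only: the counts are recovered from the rounded percentages in Table 2, which is an assumption, the formula is used without a continuity correction, and the function names are hypothetical. The text does not say which form of chi-square the authors applied.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def count_from_percentage(percentage, n):
    """Recover an approximate count from a rounded percentage (an assumption)."""
    return round(percentage * n / 100)


# High-level conceptualization in Part A, from Table 2.
normals_high = count_from_percentage(61, 23)    # 14 of 23
institute_high = count_from_percentage(67, 45)  # 30 of 45
brooklyn_high = count_from_percentage(17, 23)   # 4 of 23

normals_vs_brooklyn = chi_square_2x2(normals_high, 23 - normals_high,
                                     brooklyn_high, 23 - brooklyn_high)
normals_vs_institute = chi_square_2x2(normals_high, 23 - normals_high,
                                      institute_high, 45 - institute_high)

CRITICAL_01_DF1 = 6.635  # tabled chi-square value for p = .01, 1 df
```

With these recovered counts the Normal versus Brooklyn State comparison exceeds the tabled .01 value, while the Normal versus Psychiatric Institute comparison is far below it, in line with the reported pattern.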


Discussion 


The results of this experiment indicate that 
the Part B and Ct (triangle boundary) scores lacked
reliability for both of the populations retested. The other
three scores, Cc (circle boundary), Cm (middle area), and D, showed
statistically significant reliabilities for one or 
the other of the two groups, but not for both 
on any single portion of the test. These find- 
ings would indicate that the test as a whole 
is not sufficiently reliable for ordinary use. 

A comparison of the three groups, with one 
exception, failed to yield scores on any por- 
tion of the test that would significantly dif-
ferentiate the normal from the experimental
groups. It was found that on Part A, the dif- 
ference between the Normal and Brooklyn 
State sample was significant at the .01 level, 
but the same score did not differentiate the 
Psychiatric Institute group from the Nor- 
mals. Since only one of six scores was found 
to distinguish between the experimental and 
control groups, we would conclude that the 
differential power of the instrument for the 
populations under comparison is very ques-
tionable.

Thus the results of this experiment indi-
cate rather low reliability, and very question-
able validity for Zaslow’s test. As originally
developed the test differentiated between a
control and hospitalized chronic schizophrenic
group. It appears to be inadequate for use in
distinguishing between a normal control group 
and schizophrenics seen early in their hos- 
pitalization. 


Summary 


The Zaslow test of concept formation was 
administered to a group of normals, and two 
dissimilar groups of schizophrenics. The re- 
sults were analyzed in the same manner as 
that used by Zaslow in his study. The find- 
ings of this study indicate that the reliability 
coefficients obtained from the normal and 
the schizophrenic group were not statistically 
significant for both groups on any single por- 
tion of the test. Only one of six scores was 
found to distinguish between the experimen- 
tal and control groups. Thus, these results 
would indicate rather low reliability, and 
very questionable validity for Zaslow’s test 
for use in differentiating between normals 
and hospitalized schizophrenics seen early in 
their hospitalization.


There is little research literature on the sev-
eral “nonpersonality” variables which may af-
fect performance on visual-motor tests. It is 
the purpose of this paper to offer evidence 
pertaining to the operation of the variables 
of item difficulty and order of presentation 
in producing differences in recalling designs 
from memory. It is possible that many of the 
differences in drawing performance among 
clinical groups that are currently ascribed to 
organic pathology or personality may be ac- 
counted for more economically within the 
framework of well-known serial learning 
principles. 

In the conventional administration of the 
Bender-Gestalt Test (B-G), each of the nine 
designs is presented singly in a prescribed 
order and copied by the subject. A minimum 
of formal instruction is used and no time lim- 
its are imposed. Immediately following the 
last drawing, the materials are removed and 
the S may be asked to reproduce as many of 
the designs as he can recall (4). This tech- 
nique affords a measure of the learning tak- 
ing place during the copying phase of the test, 
and fits the paradigm of one-trial serial learn- 
ing under free recall conditions, with the im- 
portant exception that the S is not instructed 
to learn. 

Serial position effects have been studied 
extensively in verbal learning situations and 
have been found to be an important source
of variance in the recall of learned material.
The usual paradigm for such studies involves
successive exposures of the material until
some criterion of learning is reached. Reten-
tion is then tested by some form of recall. It
should be noted that the B-G procedure al-
lows only one presentation of the materials
with a free recall. No studies involving visual-
motor performance approximating this para-
digm have been found in the literature, and
only one verbal learning study seems suffi-
ciently similar to this procedure to serve as a
guide for hypotheses concerning serial posi-
tion effects on this task. Welch and Burnett
(5) reported a study in which a list of non-
sense syllables was presented once with in-
structions to learn. Testing by free recall was
made immediately following the tachistoscopic
presentations. The curves they reported for
lists of eight syllables were interpreted as
showing frequency of recall to be an increas-
ing monotonic function of serial position in
the list.


1 The authors wish to acknowledge their indebted-
ness to Dr. I. E. Farber of the State University of
Iowa for his generous assistance in the planning and
execution of this research as well as in the prepara-
tion of this manuscript.

2 Now at Worcester State Hospital.

3 Now at Yale University.

The effect of the variable of serial position 
upon recall can be masked by the operation 
of many other variables, including design diffi- 
culty. Although no experimental evidence is 
available as to the presence or extent of in- 
teritem differences in difficulty level on the 
B-G, the work of experimenters like Benton 
(2) strongly suggests that such differences in 
design difficulty on visual retention tasks may 
be appreciable. If this source of variance is 
important, it could obscure the relative con- 
tribution of serial position in determining re- 
call frequency. 

It is highly probable that many variables 
other than those discussed above affect the 
recall of the Bender designs. Age, sex, previous
training, degree of maladjustment and
other individual differences variables would 
seem to be of possible importance. The pres- 
ent paper is restricted to a relatively homo- 
geneous population in terms of these variables. 
The major purpose of the present investiga- 
tion was to study the effects of serial position 
and design difficulty upon the recall of the 
B-G designs. The experimental design em- 
ployed, however, also afforded the opportunity 
of studying sex differences on this task. 


Experiment I 
Method 


Subjects. Fifty-four undergraduates, 27 
males and 27 females, from an advanced un- 
dergraduate course in psychology during the 
Spring, 1953, semester at the State Univer- 
sity of Iowa served as Ss. 

Procedure. In order to study the effects of 
serial position, nine arrangements of design 
presentation were used; each arrangement involved
changing the serial position of the designs
while holding the sequence constant.4
Each arrangement was presented to three 
male and three female Ss in an individual 
testing session. Each S was given a lead 
pencil and a stack of 5½ by 8½-inch white
paper and then shown the designs one by one. 
As part of the standard instructions, E ex- 
plained that the designs would be presented 
individually and that each design was to be 
copied on a separate page within a 40-second 
time limit. 
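
The counterbalancing scheme described above can be sketched in a few lines. This is only an illustration: the function and variable names are hypothetical, and the sketch assumes that the nine arrangements are the nine cyclic rotations of the conventional order, as the footnoted examples (A-1-2 ... 8, 8-A-1- ... 7, 7-8-A- ... 6) suggest.

```python
# Illustrative sketch; names are hypothetical and not taken from the study materials.
DESIGNS = ["A", "1", "2", "3", "4", "5", "6", "7", "8"]


def cyclic_arrangements(designs):
    """Return every cyclic rotation of the list, keeping the sequence intact."""
    n = len(designs)
    # A shift of k moves the last k designs to the front:
    # k=0 gives A-1-...-8, k=1 gives 8-A-1-...-7, k=2 gives 7-8-A-...-6.
    return [designs[n - k:] + designs[:n - k] for k in range(n)]


def is_latin_square(arrangements):
    """True if each design occupies each serial position exactly once."""
    reference = sorted(arrangements[0])
    for position in range(len(arrangements)):
        column = [row[position] for row in arrangements]
        if sorted(column) != reference:
            return False
    return True


arrangements = cyclic_arrangements(DESIGNS)
for row in arrangements:
    print("-".join(row))
print("Latin square:", is_latin_square(arrangements))
```

With six Ss (three male, three female) per arrangement, nine such arrangements account for the 54 Ss of Experiment I.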

Following a six-minute interpolated draw-a-person
task,5 each S was asked to draw as
many of the original designs as he could re- 
member, each design on a separate sheet of 
paper. The E stopped the recall session after 
eight minutes unless S stated he could recall
no more designs before this time had elapsed. 

Scoring. All recalled designs were scored by 
a single author as acceptable or unacceptable 
according to an arbitrary scoring scheme de- 
vised by the authors. In general, the drawings 


4 A-1-2 ... 8, 8-A-1- ... 7, and 7-8-A- ... 6
have the same sequence of items, but each design oc- 
curs in a different serial position. 

5 This study is concerned only with the recall 
drawings. No analysis of the original drawings was 
made. The draw-a-person task was used only to 
provide a uniform means of filling the interval be- 
tween the original drawing and recall. 
Table 1

Summary of the Analysis of Variance: Experiment I

                                  Mean
Source of variance         df    square      F

Between Ss                 53
  Sex (S)                   1     .052
  P x D                     8     .27      1.17
  S x P x D                 8     .366     1.50
  Error (b)                36     .244

Within Ss                 432
  Serial position (P)       8     .333     1.76
  Design (D)                8    1.644     8.68**
  P x D                    56     .254     1.34*
  P x S                     8     .116      .61
  D x S                     8     .167      .88
  S x P x D                56     .179      .94
  Error (w)               288     .189

Total                     485

** p < .001.
* p < .05.

were required to be suitable in the form of 
the elements, free of distortions in line or dot 
quality, contiguity of elements, rotation, or 
other variations in the pattern (4). No limits
were imposed on size, tone, or page orientation
of the figures.6

As a reliability check, the drawings were 
rescored by another of the authors, with 86.3 
per cent agreement in scoring. Each S’s draw- 
ings were scored as a set, rather than scoring 
each design for all Ss at one time. It is be- 
lieved that this latter procedure would have 
improved the reliability, but the former procedure
is more in line with usual clinical practice.


Results 


The data, which consisted of the propor- 
tions of correctly recalled designs, were ana- 
lyzed by a Lindquist Type IV mixed design 
(3, pp. 285-288) in which the separate B-G 
designs (D) and serial positions (P) com- 
prised the Latin square factors and sex (S) 
was a between-Ss factor. In this experimental 
design, the individual designs and the serial 
positions are completely counterbalanced. Es- 
sentially, the experiment involved two replica- 
tions of the counterbalanced factors, one with 
male Ss and the other with female Ss. 

The results of the analysis of variance of 
these data are reported in Table 1. As can be 


6 Mimeographed copies of these scoring standards
may be obtained upon request. 
Table 2

Proportion of Ss Correctly Recalling Each Design
and the SE of Each Proportion

                              Designs
Measure        A     1     2     3     4     5     6     7     8

Proportion    .85   .89   .80   .41   .50   .67   .63   .44   .65
SE prop       .05   .04   .05   .07   .07   .06   .07   .07   .07





seen, no significant effect was found attribut- 
able to sex. Therefore, for subsequent analy- 
ses, the data for the two sexes were combined. 

The obtained F for design difficulty (D) 
was highly significant (p < .001). Table 2 re- 
ports the proportion of Ss correctly reproduc- 
ing each design and the SE of each propor- 
tion. The differences among the proportions 
for the individual designs suggest that there 
are three levels of difficulty with designs A, 
1, and 2 about equally easy; designs 5, 6, 
and 8 about equal at an intermediate level of 
difficulty; and designs 3, 4, and 7 the most 
difficult according to the scoring scheme used. 
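
The standard errors in Table 2 can be approximated from the reported proportions. The sketch below assumes the usual binomial formula, SE = sqrt(p(1 - p)/n), with n = 54 Ss; the paper does not state which formula was used, so this is a plausibility check and not the authors' own computation. The names are hypothetical.

```python
import math

N_SUBJECTS = 54  # Experiment I sample size, from the Subjects paragraph

# Proportions of Ss correctly recalling each design, read from Table 2.
PROPORTIONS = {
    "A": 0.85, "1": 0.89, "2": 0.80,
    "3": 0.41, "4": 0.50, "5": 0.67,
    "6": 0.63, "7": 0.44, "8": 0.65,
}


def se_of_proportion(p, n):
    """Binomial standard error of a proportion (assumed formula)."""
    return math.sqrt(p * (1.0 - p) / n)


standard_errors = {
    design: se_of_proportion(p, N_SUBJECTS) for design, p in PROPORTIONS.items()
}

# List the designs from easiest to hardest with their approximate SEs.
for design, p in sorted(PROPORTIONS.items(), key=lambda item: -item[1]):
    print(f"design {design}: p = {p:.2f}, SE = {standard_errors[design]:.3f}")
```

Because each SE is roughly .04 to .07, differences between the three difficulty levels (about .20 or more) are large relative to sampling error, while differences within a level are not.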

The F test for serial position (P) in Table 
1 attained only a moderate level of statistical 
Fig. 1. The proportion of Ss correctly recalling designs
at each serial position in Experiment I. The individual
designs were presented equally often in each
position, counterbalancing the effect of design difficulty.
The data from men and women were pooled.
significance (p < .08). The serial position effects
presented in Fig. 1, however, are so
similar to the results obtained by Welch and 
Burnett (5) that further research seemed 
warranted. 

Although the obtained F for the within 
component of the “mixed” P x D interaction 
was barely significant (p < .05), the hy- 
pothesis of no interaction cannot be rejected 
in this type of experimental design unless 
either or both components of this interaction 
are significant at the .025 level of confidence 
(3, p. 280). In the present experiment, there- 
fore, the hypothesis of no interaction must be 
retained. 
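
The within-Ss F ratios discussed above follow directly from the mean squares in Table 1, each being the effect mean square divided by the within-Ss error mean square. The sketch below recomputes them from the tabled values and checks the degrees-of-freedom bookkeeping; the names are hypothetical, and small discrepancies from the printed F values would reflect rounding of the tabled mean squares.

```python
# Degrees of freedom and mean squares for the within-Ss effects, read from Table 1.
WITHIN_EFFECTS = {
    "Serial position (P)": (8, 0.333),
    "Design (D)": (8, 1.644),
    "P x D": (56, 0.254),
    "P x S": (8, 0.116),
    "D x S": (8, 0.167),
    "S x P x D": (56, 0.179),
}
ERROR_W = (288, 0.189)  # within-Ss error term: (df, mean square)


def f_ratio(ms_effect, ms_error):
    """F is the effect mean square over the appropriate error mean square."""
    return ms_effect / ms_error


f_ratios = {
    name: f_ratio(ms, ERROR_W[1]) for name, (df, ms) in WITHIN_EFFECTS.items()
}

# The effect and error df should sum to the within-Ss total of 432.
within_df = sum(df for df, _ in WITHIN_EFFECTS.values()) + ERROR_W[0]

for name, value in f_ratios.items():
    print(f"{name}: F = {value:.2f}")
print("within-Ss df:", within_df)
```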

The obtained results suggested that the re- 
call of a design might be affected not only by 
the difficulty level of the design but also by 
its serial position in a given order of pres- 
entation; that is, design difficulty might in- 
teract with order of presentation. To test this 
hypothesis, as well as to cross validate the re- 
sults as to sex and design difficulty, a second 
experiment was performed. 


Experiment II 
Method 


Procedure. Two orders of presentation were 
developed on the basis of the rough estimates 
obtained from Experiment I of the relative 
difficulty of the nine designs (Table 2) and 
of the nine serial positions (Fig. 1). Order I 
(3, 7, 4, 6, 8, 5, 2, A, 1) involved placing the 
most difficult designs in the most difficult (the 
initial) serial positions and the easiest de- 
signs in the easiest (the final) positions. Or- 
der II (1, A, 2, 5, 8, 6, 4, 7, 3) placed the 
most difficult designs in the easiest positions 
and the easiest designs in the most difficult 
positions. These orders permitted testing the 
interaction between difficulty of the designs 
and order of presentation. Each order was 
presented to 15 male and 15 female Ss. All 
other procedures were identical to those em- 
ployed in the first experiment. 
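
The relation between the two orders can be checked mechanically. The sketch below uses the orders exactly as listed above and the hard and easy groupings from Experiment I; the helper name is hypothetical.

```python
ORDER_I = ["3", "7", "4", "6", "8", "5", "2", "A", "1"]
ORDER_II = ["1", "A", "2", "5", "8", "6", "4", "7", "3"]

HARD_DESIGNS = {"3", "4", "7"}  # most difficult designs in Experiment I
EASY_DESIGNS = {"A", "1", "2"}  # easiest designs in Experiment I


def serial_positions(order, designs):
    """Return the sorted 1-based serial positions occupied by the given designs."""
    return sorted(order.index(design) + 1 for design in designs)


print("Order II is Order I reversed:", ORDER_II == ORDER_I[::-1])
print("Order I, hard designs at:", serial_positions(ORDER_I, HARD_DESIGNS))
print("Order I, easy designs at:", serial_positions(ORDER_I, EASY_DESIGNS))
print("Order II, hard designs at:", serial_positions(ORDER_II, HARD_DESIGNS))
print("Order II, easy designs at:", serial_positions(ORDER_II, EASY_DESIGNS))
```

Order II is simply Order I reversed, so the intermediate designs (5, 6, 8) stay in the middle positions in both orders and only the hard and easy designs exchange ends of the series.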

Subjects. Sixty undergraduates, 30 males 
and 30 females, from the introductory psy- 
chology course during the Fall, 1953, semester 
at the State University of Iowa served as Ss. 

Scoring. All recalled designs were scored 
independently by three of the authors, ac- 
cording to the same scoring scheme used in 
Experiment I. The scoring of an individual 
design was the score given by the majority
of the judges. The percentage of agreement 
for all three judges was 84.3; for the three 
possible pairs of scorers the percentages of 
agreement were 91.8, 89.7, and 87.1. 
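
The majority rule and the two kinds of agreement figures can be illustrated as follows. The ratings are invented for illustration and are not the study's data, the names are hypothetical, and "agreement" is assumed to mean the percentage of drawings on which the judges in question gave identical scores.

```python
from itertools import combinations


def majority_score(ratings):
    """Given three accept/reject judgments, return the majority decision."""
    return sum(ratings) >= 2


def percent_agreement(scores_by_judge, judges):
    """Percentage of drawings on which all the named judges gave the same score."""
    n_drawings = len(scores_by_judge[judges[0]])
    n_agree = sum(
        1
        for i in range(n_drawings)
        if len({scores_by_judge[judge][i] for judge in judges}) == 1
    )
    return 100.0 * n_agree / n_drawings


# Hypothetical accept (True) / reject (False) scores for five recalled drawings.
ratings = {
    "judge_1": [True, True, False, True, False],
    "judge_2": [True, False, False, True, False],
    "judge_3": [True, True, False, False, False],
}

final_scores = [majority_score(triple) for triple in zip(*ratings.values())]
all_three = percent_agreement(ratings, list(ratings))
pairwise = {
    pair: percent_agreement(ratings, list(pair))
    for pair in combinations(ratings, 2)
}

print("final scores:", final_scores)
print("three-judge agreement:", all_three)
for pair, value in pairwise.items():
    print(pair, value)
```

Three-judge agreement can never exceed the lowest pairwise figure, which is consistent with the reported 84.3 per cent falling below each of the pairwise percentages.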


Results 


The proportions of hard and easy designs 
correctly recalled under the two orders of 
presentation are shown in Fig. 2. The recall 
of designs appears to be a function of both 
design difficulty and order of presentation, 
that is, serial position. There was little dif- 
ference between the proportions of easy de- 
signs (A, 1, 2) correctly recalled at the hard 
(1-3) and easy (7-9) serial positions. The 
hard designs (3, 4, 7), however, were correctly
recalled by only 38 per cent of Ss when
they appeared in the hard positions, while in
the easy positions they were recalled by 58 
per cent of Ss. Order of presentation appears 
to alter the recall of the hard items but has 
little or no effect on the easy items. 

These data were analyzed by a Lindquist 
Type III mixed design (3, pp. 281-284) in 
which sex (S) and the order of presentation 
(O) were between-Ss factors and the separate
B-G designs (D) were a within-Ss factor.
The results of this analysis are reported
in Table 3. 

As can be seen, the obtained F for the in- 
teraction between design and order of pres- 
entation was highly significant (p < .001), 
indicating that the recall of a given design is 
a function of both its difficulty level and its


Table 3

Summary of the Analysis of Variance: Experiment II

Source of                   Mean
variance            df     square      F

Between Ss
  Sex (S)            1      .011
  Order (O)          1      .010
  S x O              1      .153
  Error (b)         56      .201
Within Ss          480
  Design (D)         8     2.860    16.22**
  D x O              8      .451     2.56**
  D x S              8      .245     1.39
  D x O x S          8
  Error (w)        448      .176
Total              540

** p < .001.

Fig. 2. The proportion of Ss correctly recalling the
hard and easy designs in the two orders of presentation.
These curves are based upon the averages for
the easiest serial positions (7-9) and the most difficult
serial positions (1-3). While only data from the
hard (3, 4, 7) and easy (A, 1, 2) designs are presented
here, the data from all designs were included
in the statistical analysis. The data from men and
women were pooled.


serial position in a given order of presenta- 
tion. No other interactions were statistically 
significant. 

The obtained F for design difficulty was 
the only significant (p < .001) main effect, 
suggesting that difficulty level is the more im- 
portant variable affecting recall. No signifi- 
cant effect was found attributable to sex dif- 
ferences; therefore, in all the other analyses 
the data for the two sexes were combined. 
These results clearly confirm those of Experi- 
ment I. 


Discussion 


The procedure of this study deviated from 
the conventional clinical use of the B-G 
largely in the direction of greater precision 
and uniformity of stimulus presentation. 
Thus, the time spent on each design was 
held constant, the design exposures were fur- 
ther controlled by the use of individual sheets 
for copying each of the designs, and the in- 
structions were standardized. The interpola- 
tion of an interval filled with a drawing task 
was calculated to make the recall task more 








234 


difficult. This was done in order to get greater 
variability in the frequency of designs re- 
called in this group with high intellectual 
ability and to control the manner in which 
the interval was filled. Only the accuracy of 
the recalled designs was examined in the pres- 
ent study while clinical usage usually involves 
many more features of the drawings (4). 

It seems clear that these supposedly simple 
designs do differ in the frequency of recall 
even in this academically superior group. 
Little allowance is made for these design diffi- 
culty differences in the usual clinical inter- 
pretations of these drawings. Conclusions 
about a particular set of recall drawings 
should be tempered by adequate considera- 
tion of the variable of design difficulty. 

In the clinical use of the B-G a conventional
order (A-1-2 ... 8) is employed.
In this order the easiest designs are in the 
most difficult serial positions while the most 
difficult designs are in positions of intermedi- 
ate difficulty. It is not inconceivable that 
modifications in the conventional order might 
produce a more effective psychodiagnostic in- 
strument. 

The absence of any significant interaction 
effects between design difficulty and sex casts 
doubt on many of the clinical hunches which 
relate sexually relevant personality character- 
istics to particular design reproductions. The 
scoring for acceptability, however, may not 
be sufficiently sensitive to such distortions. 


Summary and Conclusions 


In an attempt to provide a firmer experi- 
mental foundation for the use of visual-motor 
tasks in the clinic, the effects of order of 
presentation, design difficulty, and sex on the 
free recall of the designs on the Bender 
Visual-Motor Gestalt Test (B-G) were in- 
vestigated. 

In Experiment I 54 college Ss, equally di- 
vided as to sex, were presented with one of 
the nine orders of the B-G designs, followed 
by a draw-a-person task and a free recall ses- 
sion. The recalled B-G designs were scored for 
accuracy and the frequency of correct recall 
was treated by an analysis of variance for de- 
sign difficulty, serial position, and sex. The ef- 
fect of design difficulty was highly significant, 
while the effect of serial position reached a 
low level of statistical significance. The design
difficulty × serial position interaction effects
were equivocal and difficult to interpret.

In Experiment II 60 college Ss, equally di- 
vided as to sex, were presented with two or- 
ders of the B-G designs. In one order the 
most difficult designs were placed in the most 
difficult serial positions while in the other or- 
der the most difficult designs were placed in 
the easiest serial positions; difficulty of de- 
signs and of serial position was estimated 
from Experiment I. The remainder of the 
procedure was identical to that of Experiment 
I. The designs were scored for accuracy and 
the frequency of correct recall was treated by 
an analysis of variance for design difficulty, 
order of presentation, and sex. The effects of 
design difficulty and the interaction between 
order of presentation and difficulty were the 
only statistically significant effects. 

The present investigation would seem to 
support the following conclusions: 

1. The B-G designs differ significantly in 
ease of recall, even in an intellectually su- 
perior population. 

2. B-G designs A, 1, and 2 appear to be the 
easiest; designs 3, 4, and 7 appear to be the 
most difficult; the remaining designs fall at 
an intermediate level of difficulty. 

3. The recall of a given design is a function 
of both its difficulty level and its serial po- 
sition in a given order of presentation. 

4. At the college level, males and females 
do not differ in total frequency of correct re- 
call, nor in frequency of correct reproduction 
of any particular design. 


Received October 22, 1954. 
Allen, Robert M. Elements of Rorschach interpreta-
tion. New York: International Universities Press,
1954. Pp. v + 242. $4.00. 


Although this textbook on the use of the Ror- 
schach is presented at a rather elementary level, it 
can scarcely be claimed that the elements of Ror- 
schach interpretation receive a comprehensive treat- 
ment. Taking as his theme the statement that “every 
performance of a person is an expression of his whole 
personality, perception included,” the author pro- 
ceeds to offer personality interpretations for various 
modes of response to the Rorschach cards. In gen- 
eral, his discussions are rich in claims for the signifi- 
cance of Rorschach responses, but poor in rational 
and empirical justification for these claims. To a 
conspicuous degree, the textbook is a recital of the 
point of view, opinions, insights, and practices of 
the author. It does not appear to be an impersonal 
presentation of commonly accepted elemental bases 
for Rorschach interpretations—J. R. W. 


Deutsch, Felix, & Murphy, William F. The clinical 
interview. Vol. 1. Diagnosis. New York: Inter- 
national Universities Press, 1955. Pp. 613. $10.00. 


This is the first of two volumes on the clinical in- 
terview, and is devoted to diagnosis; the second 
volume, promised shortly, is to deal with the clini- 
cal interview and psychotherapy. Both are designed 
for the training of psychiatric residents. Judging 
from the published volume, they can also serve well 
in the training of clinical psychologists. The authors 
develop two interesting concepts: the “associative 
anamnesis,” which is a method of “guided free asso- 
ciation” for diagnostic purposes, and “sector psycho- 
therapy,” which is a goal-limited type of therapy. 
By means of 21 recorded interviews with differ- 
ent patients showing different syndromes, Deutsch 
and Murphy convey their technique in detail. Each 
interview is preceded by a discussion of the syndrome 
involved and a brief summary of the case history; 
the interviews are generously interlarded with com- 
ments and interpretations. Each interview is de- 
signed as a complete entity, and is expected to 
achieve a predetermined goal. A chapter on “The 





Note—The reviews were prepared by the Editor 
and the Advisory Editors, who may be identified by 
their initials. 
Application of Psychoanalytic Concepts to Resident
Training and the Technique of the Therapeutic Interview”
is particularly worthwhile.—M. K.


Donahue, Wilma T. (Ed.) Education for later ma-
turity: a handbook. New York: Whiteside, Inc.
& William Morrow, 1955. Pp. xiii + 8. $4.5

Our machine age has produced increasing hours of
leisure. At the same time there has been a rapid expansion
of the life span, so that the proportion of
older people in our society has increased. Adult education
finds a real challenge in meeting the needs
and demands of these mature years. Education for
Later Maturity presents the problems of the aging
period, together with the present attempts to meet
them. It stresses the guiding principles that must be
established in order to make satisfactory solutions.
Adult education is envisaged not as a meeting of
emergency needs, but as a lifelong process. If so
conceived, the well adjusted years of maturity are
not a period requiring specific education, but are the
culmination of a process of lifelong adjustment,
growth and change. Each chapter presents a separate
aspect of the problem by different authors. The
book is so carefully conceived and edited that there
is no overlapping. The reader has a feeling of unity
and cohesiveness unusual in a book of this nature.
There is excellent coverage of all phases of adult
education, and a careful review of the present program,
often with a frank evaluation of possibilities
and limitations. The book will prove profitable reading
for anyone who is working with mature
or who is facing the problems of later maturity himself.—B. M. L.





Edwards, Allen L. Statistical methods for the be-
havioral sciences. New York: Rinehart, 1954.
Pp. vii + 542. $6.50. 

This is an elementary book in statistics. With re- 
spect to style, format, and content it falls clearly in 
the pattern of textbooks that have been accepted by 
departments of psychology for undergraduate courses 
in statistics. The area covered, however, exceeds the 
work that can be expected from a one-semester 
course at the undergraduate level. For the most part 
the treatment is modern and it should be noted that 
a chapter on significance tests for ranked data is in- 
cluded. It is likely that the reader will find this text 
to be an improvement over Edwards’ earlier text on 
statistical analysis.—J. R. W.











Macfarlane, Jean W., Allen, Lucille, & Honzik, Mar- 
jorie P. A developmental study of the behavior 
problems of normal children between twenty-one 
months and fourteen years. Berkeley and Los 
Angeles, Calif.: Univer. of California Press, 
1954. Pp. vii + 222 (paper). $2.25. 

This report is offered as a contribution to our 
knowledge of the problems and developmental cir- 
cumstances of children who are not the subject of 
special therapy or guidance. The writers suggest that 
the data are more descriptive of a “normal” popula- 
tion of families than are data based on families who 
come to clinics because of referrals. Although the 
work presumes to offer no more than a description 
and suffers the obvious limitations of an account of 
children who lived in one particular area (and of 
necessity at one particular time), it is a storehouse 
of information and may be a source of hypotheses 
for subsequent investigations and a catalog of bench 
marks useful for comparison and reference. The sam- 
ple comprised 126 infants who were examined at 
regular subsequent intervals, with diminution of 
numbers resulting from the usual factors which in- 
terfere with “follow-up” studies. One of the most 
interesting features of this publication is the pres- 
entation of the relative incidence of various types 
of “problem” behavior through the critical period 
beginning at age five and ending at age fourteen.—J. R. W.


Meerloo, J. A. M. The two faces of man. New York: 
International Universities Press, 1954. Pp. x+ 
237. $4.00 


This little volume contains two essentially inde- 
pendent long essays. One, entitled “Father Time,” is 
a thoughtful consideration of the psychology of time 
perception and the experience of duration and tem- 
poral relationships. The other, the title essay, is con- 
cerned with ambivalence and the psychological sig- 
nificance of the bipolarities and antinomies that seem 
omnipresent in the world of experience. In both 
essays, Meerloo demonstrates a rich background of 
clinical experience and a sensitive familiarity with 
the tensions that arise from the ambiguities of com- 
plex social life. His erudition is considerable, and his 
leisurely, learned pace sometimes masks a boldness 
of thought disciplined by a genuine feeling for evi- 
dence and logic in the testing of intuitive formula- 
tions. Psychologists will find much in this slim book 
to sharpen their clinical insights and to challenge 
their investigative resourcefulness. A Dutch psy- 
chiatrist now practicing in the United States, Dr. 
Meerloo is known through his previous books for 
his ingenuity in psychological conceptualization and 
his easy, civilized English style. The Two Faces of 
Man continues and justifies his reputation for innovation
and gentleness.—E. J. S.


Remmers, H. H. Introduction to opinion and atti- 
tude measurement. New York: Harper, 1954. 
Pp. viii + 437. $5.00. 

Although written as a text for an undergraduate 
course in attitude measurement, this well-written 


New Books and Tests 


volume provides a useful survey of the methods and 
problems of attitude measurement for anyone whose 
training did not include such a course at either the 
undergraduate or graduate level. Part I, entitled 
“Techniques of Attitude and Opinion Measurement,” 
includes a discussion of all current techniques except 
those recently introduced by Coombs. Part II, the 
last half of the book, includes chapters on the application
of opinion and attitude measurement in
business, government, industry, community relations,
and education. Each chapter is concluded with a
series of thought-provoking questions (many of
which are not answered in the text!) and a selected
bibliography.—E. L. K.


Robinson, Mary F., & Freeman, Walter. Psychosur- 
gery and the self. New York: Grune & Stratton, 
1954. Pp. ix + 118. $3.00. 

There is packed into this clear and carefully writ- 
ten 118-page book an explanation of psychosurgery 
including the less drastic transorbital lobotomy, de- 
scriptions of postlobotomy personalities, an objective 
and concise review of the literature on the procedure 
as well as current psychological concepts of the self.
This gives perspective to an experiment on the effects 
of the standard psychosurgery on the patient's self 
as compared with a control group of patients who 
recovered without the operation. Dr. Robinson’s 
basic hypothesis is that psychosurgery changes the 
structure of the self through reducing the capacity 
for the feeling of self-continuity. Her design pro- 
vides evidence substantiating the hypothesis, and her 
experimental adventure into this important but un- 
explored area of psychology should stimulate theory 
and research. Her methods seem intriguingly simple 
but penetrating, and her insights into the fundamen- 
tal changes after prefrontal operations are satisfying. 
The undesirable changes which are found along with 
reduced anxiety, introversion, and neuroticism are 
indicated throughout the book, and it is clearly 
stated that the study is investigative rather than 
evaluative. Nevertheless, the reviewer feels that, de- 
spite Dr. Robinson’s scientific objectivity and hon- 
esty, the flavor of the volume is definitely favorable 
to psychosurgery —F. McK. 


Sappenfield, Bert R. Personality dynamics. New 
York: Alfred A. Knopf, 1954. Pp. xiv + 412 + 
xvi. $5.50. 

In this text, the subtitle of which is “An Integra- 
tive Psychology of Adjustment,” the author has at- 
tempted, first, a systematic presentation of psycho- 
analytic principles and, second, an integration of 
psychodynamic principles with an organismic con- 
ception of behavior. Furthermore, although employ- 
ing much of Freud’s original terminology, the author 
attempts to redefine such terms in language more 
common to American psychology (the glossary con- 
tains almost 300 such definitions). In these efforts at 
integration, the author proposes what he regards as 
four contributions to conceptual clarification: (a) the 
definition of id functions as those involving biogenic 
need-tensions, ego functions as those involving cognitive
(intellectual) and voluntary processes, and superego
functions as involving all aspects of psychogenic
motivation; (b) the analysis of motives into
three components: need, instrumental act, and ob- 
ject; (c) equating the concepts “id functions” and 
anxiety and defining both as “the occurrence in con- 
sciousness of biogenic need-tensions without con- 
scious representation of instrumental acts or objects 
in relation to which gratification or relief may be 
anticipated”; (d) the distinction between perceptual 
identification and developmental identification. Em- 
phasis throughout the text is on the development 
and functioning of the normal personality, and al- 
though pathological conditions are often used by 
way of illustration, it is not a textbook in abnormal 
psychology. For example, the word “schizophrenia” 
does not appear in a twelve-page subject index. In- 
stead, it is a text which seeks within a single volume 
(a) to familiarize the student with all essential as- 
pects of psychoanalytic theory, and (b) to get him 
in the habit of using this theoretical formulation and 
terminology in thinking about his own behavior and 
that of his peers. As such it is not an easy text. I 
predict that undergraduates will find it interesting 
but difficult. Graduate students should find it a help- 
ful introduction, especially in view of the many ref- 
erences to original sources. The style is almost com- 
pletely didactic with practically no reference to the 
methodological problems of assessing dynamic proc- 
esses, to the variables involved, or to empirical 
studies bearing on the topics discussed.—E. L. K.


Savage, Leonard J. The foundations of statistics. 
New York: Wiley, 1954. Pp. xv + 294. $6.00. 
Lest the prospective reader be misled by the title, 
it should be stated at once that this is not an intro- 
ductory text in statistics. In fact, it is not a text- 
book at all but rather a highly sophisticated dis- 
cussion of the philosophical and logical bases of 
statistics “as a discipline of rational decision in the 
face of uncertainty.” The author develops and defends
the currently somewhat unfashionable “personalistic
viewpoint” of probability and then discusses
its implications for observation and experimentation, 
for decision making, and for current statistical theory 
and practice. The preface suggests that any but the 
casual reader should have some formal preparation 
in the theory of mathematical probability, at least a 
year of calculus and some training in formal logic. 
In addition, it is suggested that he should “sit bolt 
upright on a hard chair at a desk.” Even though the 
present reviewer does not pretend to have worked 
his way through all of the book in this manner, he 
did find the less mathematical chapters highly informative
and thought provoking. He believes that
many other psychologists will find it worth a few 
hours of their time. A bibliography of some 170 
items is included.—E. L. K.


Tests 


Drake, Raleigh M. Drake Musical Aptitude Tests.
Ages 8-adult. 2 forms. (80) min. Microgroove
phonograph record ($5.95); self-scoring answer
pads, musical memory ($2.45 per 20), rhythm
($2.45 per 20); manual, pp. 32 (75¢); specimen
set ($6.95). Chicago: Science Research Asso-
ciates, 1954.

The Drake tests are the outcome of a life career 
devoted to measurement in music, and are a sub- 
stantial contribution to their field. In place of the 
more familiar potpourri of short samples, these tests 
measure only two kinds of performance, at a length
that permits adequate reliability (mainly .85 to .95).
The two subtests have very low correlations with
each other and with age and intelligence, and moderate
correlations (about .35) with musical training.
The norms are remarkably stable among several na- 
tional and ethnic groups. Validity, unfortunately, is 
reported only in terms of correlations with teachers’ 
ratings, but such correlations are about as high as 
the reliabilities of the ratings would permit. The 
manual conforms to high standards in displaying extensive
and relevant data. Both musically and psychometrically,
this is a good test.—L. F. S.


Thorpe, Louis P., Meyers, Charles E., & Sea, Mar- 
cella R. What I like to do, an inventory of chil- 
dren’s interests. Grades 4-6. 1 form. Untimed 
(50-60) min. Hand or IBM scoring. Booklet 
($2.95 per 20); IBM answer sheet ($2.90 per 
100); profile sheet 20 
16 (25¢); specimen set (50¢) 
Research Associates, 1954. 

What I Like To Do is an uninspired but work- 
manlike job in the construction of an interest in- 
ventory for grades 4 to 6 and, by implication but 
without data, for grade 7. The 294 items assess in- 
terests in art, music, social studies, active play, quiet 
play, manual arts, home arts, and science. The items 
were developed from the literature on children’s in- 
terests, and were screened both by judgments and 
by an empirical trial. National norms are based on 
3,803 children in grades 4 to 6, and Kuder-Richardson 
reliabilities of the separate parts center around 
.87. A general factor of liking things seems to be 
present, as most of the scale intercorrelations exceed 
.50.—L. F. S. 


PSYCHOLOGY 


Edited by A. A. ROBACK 


A definitive volume of 40 original contributions embracing practically 
the whole range of psychology from the neurological basis to military and 
parapsychology, each chapter written by an expert in his field expressly for 
this work. 
A PIONEERING WORK OF TIMELY IMPORTANCE 


BASIC CONCEPTS 
IN 
VOCATIONAL GUIDANCE 

by 
HERBERT SANDERSON 
Vocational Counseling Center 
University of Buffalo 
346 pages 


This is the first vocational guidance book to discuss at length those 
principles and techniques of social casework as they apply to educational 
and vocational counseling. 

AIM The author's [illegible] practicing counselors and students of counseling 
to familiarize themselves with the fundamental principles in educational and 
vocational counseling. 

SCOPE The book covers the theoretical and practical aspects of the subject 
as they may arise in working with both adolescents and adults. It discusses the 
psychological forces that prompt the client to seek vocational help, the 
difficulties encountered at the outset, the counseling process itself, the role 
of the counselor and his psychological needs and the ending phase. The 
relationship between vocational guidance and other helping disciplines is 
described in considerable detail, and the need for professional supervision is 
discussed. 
well as anyone in the country what kind of information teachers and 
students of occupations need and want. 


CAREERS 


by 
WALTER J. GREENLEAF 


U. S. Office of Education 
606 pages $4.20 


Organized in three parts: 
1. the individual—his interests, hobbies, local opportunities, how to study occupations, and 
how to get a job. 
2. individual occupations—classified according to the Dictionary of Occupational Titles—and 
discussed as to groups within each classification. 
3. descriptions of industrial processes and occupations in the large major industries of our 
country. 
This text and reference book, filled with authentic and specific information, includes detailed, precise 
lists, tables, figures, and data of practical help to students and teachers of occupations. 